Overview

Dataset statistics

Number of variables: 112
Number of observations: 186529
Missing cells: 6512992
Missing cells (%): 31.2%
Total size in memory: 159.4 MiB
Average record size in memory: 896.0 B
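
The derived figures in this table can be cross-checked from the raw counts. The following is a minimal sketch, assuming that the missing-cell percentage is the number of missing cells divided by variables times observations, and that total memory is roughly the average record size times the row count. The variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 112
n_observations = 186_529
missing_cells = 6_512_992
avg_record_bytes = 896.0

# Assumption: "Missing cells (%)" is the share of all cells that are missing.
total_cells = n_variables * n_observations
missing_pct = round(100 * missing_cells / total_cells, 1)

# Assumption: total memory is roughly average record size times row count.
total_mib = round(avg_record_bytes * n_observations / 2**20, 1)

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct}%")
print(f"total size: {total_mib} MiB")
```

Under these assumptions the sketch reproduces 31.2% and 159.4 MiB. The 896 B record size also equals 112 × 8 bytes, which would be consistent with one 8-byte reference per cell, although the report does not state this.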

Variable types

Text: 112

Dataset

Description: Botany Division, Yale Peabody Museum 0061682-241126133413365
URL: https://doi.org/10.15468/dl.twf535

Alerts

accessRights has constant value "Open Access, http://creativecommons.org/publicdomain/zero/1.0/; see Yale Peabody policies at: http://hdl.handle.net/10079/8931zqj" Constant
language has constant value "en" Constant
license has constant value "CC0_1_0" Constant
publisher has constant value "Yale University Peabody Museum" Constant
rightsHolder has constant value "Yale Peabody Museum" Constant
type has constant value "PhysicalObject" Constant
institutionCode has constant value "YPM" Constant
ownerInstitutionCode has constant value "YPM" Constant
basisOfRecord has constant value "PRESERVED_SPECIMEN" Constant
individualCount has constant value "1" Constant
occurrenceStatus has constant value "PRESENT" Constant
preparations has constant value "tissue (frozen)" Constant
disposition has constant value "in collection" Constant
nomenclaturalCode has constant value "ICBN" Constant
taxonRemarks has constant value "Animals and Plants: Plants" Constant
datasetKey has constant value "963f12d0-f762-11e1-a439-00145eb45e9a" Constant
publishingCountry has constant value "US" Constant
protocol has constant value "EML" Constant
lastCrawled has constant value "2025-01-07T13:01:58.967Z" Constant
isSequenced has constant value "false" Constant
publishedByGbifRegion has constant value "NORTH_AMERICA" Constant
recordNumber has 139017 (74.5%) missing values Missing
recordedBy has 75764 (40.6%) missing values Missing
reproductiveCondition has 186504 (> 99.9%) missing values Missing
preparations has 186476 (> 99.9%) missing values Missing
associatedReferences has 176462 (94.6%) missing values Missing
associatedTaxa has 185782 (99.6%) missing values Missing
eventDate has 84019 (45.0%) missing values Missing
startDayOfYear has 103374 (55.4%) missing values Missing
endDayOfYear has 103374 (55.4%) missing values Missing
year has 84248 (45.2%) missing values Missing
month has 93636 (50.2%) missing values Missing
day has 104750 (56.2%) missing values Missing
habitat has 157729 (84.6%) missing values Missing
higherGeography has 72099 (38.7%) missing values Missing
continent has 73143 (39.2%) missing values Missing
waterBody has 183495 (98.4%) missing values Missing
countryCode has 72482 (38.9%) missing values Missing
stateProvince has 78016 (41.8%) missing values Missing
county has 98586 (52.9%) missing values Missing
municipality has 110052 (59.0%) missing values Missing
locality has 125307 (67.2%) missing values Missing
verbatimElevation has 178933 (95.9%) missing values Missing
decimalLatitude has 82100 (44.0%) missing values Missing
decimalLongitude has 82100 (44.0%) missing values Missing
coordinateUncertaintyInMeters has 82138 (44.0%) missing values Missing
georeferencedBy has 182211 (97.7%) missing values Missing
georeferencedDate has 174887 (93.8%) missing values Missing
georeferenceProtocol has 82331 (44.1%) missing values Missing
georeferenceSources has 83888 (45.0%) missing values Missing
georeferenceRemarks has 85474 (45.8%) missing values Missing
typeStatus has 182608 (97.9%) missing values Missing
identifiedBy has 180415 (96.7%) missing values Missing
dateIdentified has 184582 (99.0%) missing values Missing
identificationRemarks has 182833 (98.0%) missing values Missing
phylum has 28431 (15.2%) missing values Missing
class has 28457 (15.3%) missing values Missing
order has 28496 (15.3%) missing values Missing
family has 28710 (15.4%) missing values Missing
genus has 28788 (15.4%) missing values Missing
genericName has 28825 (15.5%) missing values Missing
specificEpithet has 54371 (29.1%) missing values Missing
infraspecificEpithet has 182164 (97.7%) missing values Missing
elevation has 178933 (95.9%) missing values Missing
elevationAccuracy has 185793 (99.6%) missing values Missing
distanceFromCentroidInMeters has 186092 (99.8%) missing values Missing
mediaType has 9347 (5.0%) missing values Missing
phylumKey has 28431 (15.2%) missing values Missing
classKey has 28457 (15.3%) missing values Missing
orderKey has 28496 (15.3%) missing values Missing
familyKey has 28710 (15.4%) missing values Missing
genusKey has 28788 (15.4%) missing values Missing
speciesKey has 54335 (29.1%) missing values Missing
species has 54335 (29.1%) missing values Missing
repatriated has 72482 (38.9%) missing values Missing
gbifRegion has 72484 (38.9%) missing values Missing
level0Gid has 86228 (46.2%) missing values Missing
level0Name has 86228 (46.2%) missing values Missing
level1Gid has 86228 (46.2%) missing values Missing
level1Name has 86228 (46.2%) missing values Missing
level2Gid has 87766 (47.1%) missing values Missing
level2Name has 87766 (47.1%) missing values Missing
level3Gid has 178900 (95.9%) missing values Missing
level3Name has 178901 (95.9%) missing values Missing
iucnRedListCategory has 10881 (5.8%) missing values Missing
gbifID has unique values Unique
bibliographicCitation has unique values Unique
references has unique values Unique
dynamicProperties has unique values Unique
occurrenceID has unique values Unique
catalogNumber has unique values Unique
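
The three alert kinds listed here (Constant, Missing, Unique) can be illustrated with a small standard-library sketch. It is not the profiling tool's own implementation: the function names, the use of `None` for a missing cell, and the rule that any missing value triggers an alert are assumptions. The percentage formatting is chosen so that it reproduces the "> 99.9%" and "< 0.1%" edge cases seen in this report.

```python
from typing import Optional, Sequence


def fmt_percent(fraction: float) -> str:
    """Format a fraction as a percentage, with edge cases near 0 and 1."""
    if fraction > 0 and round(fraction, 3) == 0:
        return "< 0.1%"
    if fraction < 1 and round(fraction, 3) == 1:
        return "> 99.9%"
    return f"{fraction * 100:.1f}%"


def column_alerts(values: Sequence[Optional[str]]) -> list[str]:
    """Return alert labels for one column; None marks a missing cell."""
    alerts = []
    present = [v for v in values if v is not None]
    distinct = set(present)
    # Constant: exactly one distinct value among the non-missing cells.
    if len(distinct) == 1:
        alerts.append("Constant")
    # Missing: at least one missing cell (assumed threshold).
    n_missing = len(values) - len(present)
    if n_missing > 0:
        alerts.append(f"Missing ({fmt_percent(n_missing / len(values))})")
    # Unique: no missing cells and every value differs from all others.
    if present and n_missing == 0 and len(distinct) == len(values):
        alerts.append("Unique")
    return alerts


print(fmt_percent(139017 / 186529))  # recordNumber
print(fmt_percent(186504 / 186529))  # reproductiveCondition
print(column_alerts(["tissue (frozen)", None, None]))
```

The last call shows how a column such as preparations can carry both a Constant and a Missing alert: the check for a constant value looks only at the cells that are present.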

Reproduction

Analysis started: 2025-01-08 23:32:03.397922
Analysis finished: 2025-01-08 23:32:14.028777
Duration: 10.63 seconds
Software version: ydata-profiling v4.12.1
Download configuration: config.json
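
The reported duration can be checked from the two timestamps, and a comparable report could in principle be regenerated with ydata-profiling. The sketch below makes several assumptions that are not stated in the report: the file paths are hypothetical, the input is taken to be a tab-separated export, and every column is read as a string (which would be one explanation for all 112 variables being typed as Text).

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" section.
started = datetime.fromisoformat("2025-01-08 23:32:03.397922")
finished = datetime.fromisoformat("2025-01-08 23:32:14.028777")
duration_seconds = round((finished - started).total_seconds(), 2)
print(duration_seconds)


def build_report(tsv_path: str, html_path: str) -> None:
    """Profile a tab-separated export; both paths are hypothetical."""
    # Imported lazily so the duration check above runs without these packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: read every column as text rather than inferring types.
    df = pd.read_csv(tsv_path, sep="\t", dtype=str)
    report = ProfileReport(df, title="Botany Division, Yale Peabody Museum")
    report.to_file(html_path)
```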

Variables

gbifID
Text

Unique 

Distinct: 186529
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 1865290
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
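
As an illustration of how such character properties can be tallied, the sketch below counts Unicode general categories with Python's `unicodedata` module. It is an assumption-laden approximation, not the profiler's code: `category_counts` and the name mapping are hypothetical, only the categories that appear in this report are mapped to long names, and script and block properties are left out because `unicodedata` does not expose them.

```python
import unicodedata
from collections import Counter

# Long names for the two-letter general-category codes seen in this report.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Nd": "Decimal Number",
    "Pc": "Connector Punctuation",
    "Pd": "Dash Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
    "Zs": "Space Separator",
}


def category_counts(values):
    """Count characters per Unicode general category across all values."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the raw two-letter code for unmapped categories.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


print(category_counts(["1038985783"]))  # a gbifID sample value
print(category_counts(["CC0_1_0"]))     # the constant license value
```

For a ten-digit gbifID every character falls into Decimal Number, which matches the single distinct category reported for this variable.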

Unique

Unique: 186529
Unique (%): 100.0%

Sample

1st row: 1038985783
2nd row: 1038985820
3rd row: 1038985793
4th row: 1805296727
5th row: 4539832816
ValueCountFrequency (%)
1038985783 1
 
< 0.1%
1038985974 1
 
< 0.1%
1038985864 1
 
< 0.1%
1805437104 1
 
< 0.1%
1038985828 1
 
< 0.1%
1038985793 1
 
< 0.1%
1805296727 1
 
< 0.1%
4539832816 1
 
< 0.1%
1038985782 1
 
< 0.1%
1038985792 1
 
< 0.1%
Other values (186519) 186519
> 99.9%

Most occurring characters

ValueCountFrequency (%)
0 257934
13.8%
8 257321
13.8%
3 244061
13.1%
1 238732
12.8%
9 237662
12.7%
4 158719
8.5%
5 155085
8.3%
2 126849
6.8%
6 103154
 
5.5%
7 85773
 
4.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1865290
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 257934
13.8%
8 257321
13.8%
3 244061
13.1%
1 238732
12.8%
9 237662
12.7%
4 158719
8.5%
5 155085
8.3%
2 126849
6.8%
6 103154
 
5.5%
7 85773
 
4.6%

Most occurring scripts

ValueCountFrequency (%)
Common 1865290
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 257934
13.8%
8 257321
13.8%
3 244061
13.1%
1 238732
12.8%
9 237662
12.7%
4 158719
8.5%
5 155085
8.3%
2 126849
6.8%
6 103154
 
5.5%
7 85773
 
4.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1865290
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 257934
13.8%
8 257321
13.8%
3 244061
13.1%
1 238732
12.8%
9 237662
12.7%
4 158719
8.5%
5 155085
8.3%
2 126849
6.8%
6 103154
 
5.5%
7 85773
 
4.6%

accessRights
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 129
Median length: 129
Mean length: 129
Min length: 129

Characters and Unicode

Total characters: 24062241
Distinct characters: 38
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Open Access, http://creativecommons.org/publicdomain/zero/1.0/; see Yale Peabody policies at: http://hdl.handle.net/10079/8931zqj
2nd row: Open Access, http://creativecommons.org/publicdomain/zero/1.0/; see Yale Peabody policies at: http://hdl.handle.net/10079/8931zqj
3rd row: Open Access, http://creativecommons.org/publicdomain/zero/1.0/; see Yale Peabody policies at: http://hdl.handle.net/10079/8931zqj
4th row: Open Access, http://creativecommons.org/publicdomain/zero/1.0/; see Yale Peabody policies at: http://hdl.handle.net/10079/8931zqj
5th row: Open Access, http://creativecommons.org/publicdomain/zero/1.0/; see Yale Peabody policies at: http://hdl.handle.net/10079/8931zqj
ValueCountFrequency (%)
open 186529
11.1%
access 186529
11.1%
http://creativecommons.org/publicdomain/zero/1.0 186529
11.1%
see 186529
11.1%
yale 186529
11.1%
peabody 186529
11.1%
policies 186529
11.1%
at 186529
11.1%
http://hdl.handle.net/10079/8931zqj 186529
11.1%
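
The word table above appears to come from splitting each value on whitespace, trimming surrounding punctuation, and lowercasing. The sketch below reproduces the nine tokens under that assumption; the actual tokenisation used by the profiler may differ, and `word_counts` is a hypothetical helper.

```python
import string
from collections import Counter


def word_counts(values):
    """Count lowercase whitespace-separated tokens, trimming outer punctuation."""
    counts = Counter()
    for value in values:
        for token in value.split():
            word = token.strip(string.punctuation).lower()
            if word:
                counts[word] += 1
    return counts


# The constant accessRights value, copied from the sample rows.
access_rights = (
    "Open Access, http://creativecommons.org/publicdomain/zero/1.0/; "
    "see Yale Peabody policies at: http://hdl.handle.net/10079/8931zqj"
)
counts = word_counts([access_rights])
share_pct = round(100 / len(counts), 1)
print(sorted(counts))
print(share_pct)
```

Because every row holds the same string, each of the nine tokens occurs once per row and therefore accounts for one ninth of all tokens.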

Most occurring characters

ValueCountFrequency (%)
e 2238348
 
9.3%
/ 1865290
 
7.8%
(space) 1492232
 
6.2%
t 1305703
 
5.4%
o 1305703
 
5.4%
a 1119174
 
4.7%
c 1119174
 
4.7%
i 932645
 
3.9%
n 932645
 
3.9%
s 932645
 
3.9%
Other values (28) 10818682
45.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 16228023
67.4%
Other Punctuation 3544051
 
14.7%
Decimal Number 2051819
 
8.5%
Space Separator 1492232
 
6.2%
Uppercase Letter 746116
 
3.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 2238348
13.8%
t 1305703
 
8.0%
o 1305703
 
8.0%
a 1119174
 
6.9%
c 1119174
 
6.9%
i 932645
 
5.7%
n 932645
 
5.7%
s 932645
 
5.7%
l 932645
 
5.7%
p 932645
 
5.7%
Other values (12) 4476696
27.6%
Decimal Number
ValueCountFrequency (%)
1 559587
27.3%
0 559587
27.3%
9 373058
18.2%
8 186529
 
9.1%
7 186529
 
9.1%
3 186529
 
9.1%
Other Punctuation
ValueCountFrequency (%)
/ 1865290
52.6%
. 746116
 
21.1%
: 559587
 
15.8%
; 186529
 
5.3%
, 186529
 
5.3%
Uppercase Letter
ValueCountFrequency (%)
P 186529
25.0%
O 186529
25.0%
Y 186529
25.0%
A 186529
25.0%
Space Separator
ValueCountFrequency (%)
(space) 1492232
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 16974139
70.5%
Common 7088102
29.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 2238348
13.2%
t 1305703
 
7.7%
o 1305703
 
7.7%
a 1119174
 
6.6%
c 1119174
 
6.6%
i 932645
 
5.5%
n 932645
 
5.5%
s 932645
 
5.5%
l 932645
 
5.5%
p 932645
 
5.5%
Other values (16) 5222812
30.8%
Common
ValueCountFrequency (%)
/ 1865290
26.3%
(space) 1492232
21.1%
. 746116
 
10.5%
: 559587
 
7.9%
1 559587
 
7.9%
0 559587
 
7.9%
9 373058
 
5.3%
8 186529
 
2.6%
7 186529
 
2.6%
3 186529
 
2.6%
Other values (2) 373058
 
5.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 24062241
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 2238348
 
9.3%
/ 1865290
 
7.8%
(space) 1492232
 
6.2%
t 1305703
 
5.4%
o 1305703
 
5.4%
a 1119174
 
4.7%
c 1119174
 
4.7%
i 932645
 
3.9%
n 932645
 
3.9%
s 932645
 
3.9%
Other values (28) 10818682
45.0%

bibliographicCitation
Text

Unique

Distinct: 186529
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 62
Median length: 55
Mean length: 28.163299
Min length: 15

Characters and Unicode

Total characters: 5253272
Distinct characters: 68
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 186529
Unique (%): 100.0%

Sample

1st row: Luzula bulbosa (YU.036650)
2nd row: Gentiana clausa (CBS.028950)
3rd row: Carex muhlenbergii (YU.070008)
4th row: Lophocolea minor (YU.204399)
5th row: Plantae (YU.175465)
ValueCountFrequency (%)
plantae 28374
 
5.5%
carex 8803
 
1.7%
var 3699
 
0.7%
dryopteris 2392
 
0.5%
sphagnum 2360
 
0.5%
juncus 1814
 
0.4%
frullania 1708
 
0.3%
asplenium 1557
 
0.3%
scapania 1517
 
0.3%
canadensis 1515
 
0.3%
Other values (197634) 462834
89.6%

Most occurring characters

ValueCountFrequency (%)
a 389969
 
7.4%
(space) 330044
 
6.3%
i 262458
 
5.0%
0 223252
 
4.2%
e 205819
 
3.9%
l 196972
 
3.7%
. 190701
 
3.6%
( 186530
 
3.6%
) 186530
 
3.6%
r 175234
 
3.3%
Other values (58) 2905763
55.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2641527
50.3%
Decimal Number 1119258
21.3%
Uppercase Letter 597975
 
11.4%
Space Separator 330044
 
6.3%
Other Punctuation 190703
 
3.6%
Open Punctuation 186530
 
3.6%
Close Punctuation 186530
 
3.6%
Dash Punctuation 705
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 389969
14.8%
i 262458
9.9%
e 205819
 
7.8%
l 196972
 
7.5%
r 175234
 
6.6%
n 170772
 
6.5%
u 162576
 
6.2%
o 156926
 
5.9%
s 154857
 
5.9%
t 146008
 
5.5%
Other values (16) 619936
23.5%
Uppercase Letter
ValueCountFrequency (%)
U 149178
24.9%
Y 148189
24.8%
C 64443
10.8%
S 55381
 
9.3%
P 49351
 
8.3%
B 44799
 
7.5%
A 13951
 
2.3%
L 10862
 
1.8%
D 7781
 
1.3%
R 6989
 
1.2%
Other values (16) 47051
 
7.9%
Decimal Number
ValueCountFrequency (%)
0 223252
19.9%
2 150900
13.5%
1 131446
11.7%
3 102608
9.2%
4 92196
8.2%
5 86628
 
7.7%
8 85549
 
7.6%
7 83994
 
7.5%
6 83823
 
7.5%
9 78862
 
7.0%
Other Punctuation
ValueCountFrequency (%)
. 190701
> 99.9%
? 2
 
< 0.1%
Space Separator
ValueCountFrequency (%)
(space) 330044
100.0%
Open Punctuation
ValueCountFrequency (%)
( 186530
100.0%
Close Punctuation
ValueCountFrequency (%)
) 186530
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 705
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3239502
61.7%
Common 2013770
38.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 389969
 
12.0%
i 262458
 
8.1%
e 205819
 
6.4%
l 196972
 
6.1%
r 175234
 
5.4%
n 170772
 
5.3%
u 162576
 
5.0%
o 156926
 
4.8%
s 154857
 
4.8%
U 149178
 
4.6%
Other values (42) 1214741
37.5%
Common
ValueCountFrequency (%)
(space) 330044
16.4%
0 223252
11.1%
. 190701
9.5%
( 186530
9.3%
) 186530
9.3%
2 150900
7.5%
1 131446
 
6.5%
3 102608
 
5.1%
4 92196
 
4.6%
5 86628
 
4.3%
Other values (6) 332935
16.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5253272
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 389969
 
7.4%
(space) 330044
 
6.3%
i 262458
 
5.0%
0 223252
 
4.2%
e 205819
 
3.9%
l 196972
 
3.7%
. 190701
 
3.6%
( 186530
 
3.6%
) 186530
 
3.6%
r 175234
 
3.3%
Other values (58) 2905763
55.3%

language
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 373058
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: en
2nd row: en
3rd row: en
4th row: en
5th row: en
ValueCountFrequency (%)
en 186529
100.0%

Most occurring characters

ValueCountFrequency (%)
e 186529
50.0%
n 186529
50.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 373058
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 186529
50.0%
n 186529
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 373058
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 186529
50.0%
n 186529
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 373058
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 186529
50.0%
n 186529
50.0%

license
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 1305703
Distinct characters: 4
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: CC0_1_0
2nd row: CC0_1_0
3rd row: CC0_1_0
4th row: CC0_1_0
5th row: CC0_1_0
ValueCountFrequency (%)
cc0_1_0 186529
100.0%

Most occurring characters

ValueCountFrequency (%)
C 373058
28.6%
0 373058
28.6%
_ 373058
28.6%
1 186529
14.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 559587
42.9%
Uppercase Letter 373058
28.6%
Connector Punctuation 373058
28.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 373058
66.7%
1 186529
33.3%
Uppercase Letter
ValueCountFrequency (%)
C 373058
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 373058
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 932645
71.4%
Latin 373058
 
28.6%

Most frequent character per script

Common
ValueCountFrequency (%)
0 373058
40.0%
_ 373058
40.0%
1 186529
20.0%
Latin
ValueCountFrequency (%)
C 373058
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1305703
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
C 373058
28.6%
0 373058
28.6%
_ 373058
28.6%
1 186529
14.3%
Distinct: 7024
Distinct (%): 3.8%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 20
Median length: 20
Mean length: 20
Min length: 20

Characters and Unicode

Total characters: 3730580
Distinct characters: 14
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1517
Unique (%): 0.8%

Sample

1st row: 2023-03-01T19:35:25Z
2nd row: 2020-10-02T23:17:12Z
3rd row: 2020-12-23T21:50:47Z
4th row: 2020-06-26T23:18:45Z
5th row: 2024-03-19T11:52:47Z
ValueCountFrequency (%)
2015-11-29t17:24:32z 16880
 
9.0%
2020-12-23t21:50:47z 9978
 
5.3%
2020-08-11t23:38:35z 9456
 
5.1%
2020-10-02t23:17:12z 6413
 
3.4%
2022-03-19t21:48:41z 5153
 
2.8%
2015-11-29t17:24:36z 5077
 
2.7%
2019-12-07t23:19:07z 4868
 
2.6%
2015-11-28t13:37:37z 3604
 
1.9%
2015-11-28t13:37:48z 3531
 
1.9%
2024-03-20t22:00:25z 3149
 
1.7%
Other values (7014) 118420
63.5%

Most occurring characters

ValueCountFrequency (%)
2 718500
19.3%
0 500102
13.4%
1 466016
12.5%
- 373058
10.0%
: 373058
10.0%
3 227531
 
6.1%
4 206600
 
5.5%
T 186529
 
5.0%
Z 186529
 
5.0%
5 159192
 
4.3%
Other values (4) 333465
8.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2611406
70.0%
Dash Punctuation 373058
 
10.0%
Other Punctuation 373058
 
10.0%
Uppercase Letter 373058
 
10.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 718500
27.5%
0 500102
19.2%
1 466016
17.8%
3 227531
 
8.7%
4 206600
 
7.9%
5 159192
 
6.1%
7 107197
 
4.1%
8 80992
 
3.1%
9 79600
 
3.0%
6 65676
 
2.5%
Uppercase Letter
ValueCountFrequency (%)
T 186529
50.0%
Z 186529
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 373058
100.0%
Other Punctuation
ValueCountFrequency (%)
: 373058
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 3357522
90.0%
Latin 373058
 
10.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 718500
21.4%
0 500102
14.9%
1 466016
13.9%
- 373058
11.1%
: 373058
11.1%
3 227531
 
6.8%
4 206600
 
6.2%
5 159192
 
4.7%
7 107197
 
3.2%
8 80992
 
2.4%
Other values (2) 145276
 
4.3%
Latin
ValueCountFrequency (%)
T 186529
50.0%
Z 186529
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3730580
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 718500
19.3%
0 500102
13.4%
1 466016
12.5%
- 373058
10.0%
: 373058
10.0%
3 227531
 
6.1%
4 206600
 
5.5%
T 186529
 
5.0%
Z 186529
 
5.0%
5 159192
 
4.3%
Other values (4) 333465
8.9%

publisher
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 30
Median length: 30
Mean length: 30
Min length: 30

Characters and Unicode

Total characters: 5595870
Distinct characters: 20
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Yale University Peabody Museum
2nd row: Yale University Peabody Museum
3rd row: Yale University Peabody Museum
4th row: Yale University Peabody Museum
5th row: Yale University Peabody Museum
ValueCountFrequency (%)
yale 186529
25.0%
university 186529
25.0%
peabody 186529
25.0%
museum 186529
25.0%

Most occurring characters

ValueCountFrequency (%)
e 746116
 
13.3%
(space) 559587
 
10.0%
s 373058
 
6.7%
y 373058
 
6.7%
u 373058
 
6.7%
i 373058
 
6.7%
a 373058
 
6.7%
M 186529
 
3.3%
d 186529
 
3.3%
o 186529
 
3.3%
Other values (10) 1865290
33.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4290167
76.7%
Uppercase Letter 746116
 
13.3%
Space Separator 559587
 
10.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 746116
17.4%
s 373058
 
8.7%
y 373058
 
8.7%
u 373058
 
8.7%
i 373058
 
8.7%
a 373058
 
8.7%
d 186529
 
4.3%
o 186529
 
4.3%
b 186529
 
4.3%
t 186529
 
4.3%
Other values (5) 932645
21.7%
Uppercase Letter
ValueCountFrequency (%)
M 186529
25.0%
P 186529
25.0%
Y 186529
25.0%
U 186529
25.0%
Space Separator
ValueCountFrequency (%)
(space) 559587
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5036283
90.0%
Common 559587
 
10.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 746116
14.8%
s 373058
 
7.4%
y 373058
 
7.4%
u 373058
 
7.4%
i 373058
 
7.4%
a 373058
 
7.4%
M 186529
 
3.7%
d 186529
 
3.7%
o 186529
 
3.7%
b 186529
 
3.7%
Other values (9) 1678761
33.3%
Common
ValueCountFrequency (%)
(space) 559587
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5595870
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 746116
 
13.3%
(space) 559587
 
10.0%
s 373058
 
6.7%
y 373058
 
6.7%
u 373058
 
6.7%
i 373058
 
6.7%
a 373058
 
6.7%
M 186529
 
3.3%
d 186529
 
3.3%
o 186529
 
3.3%
Other values (10) 1865290
33.3%

references
Text

Unique 

Distinct: 186529
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 63
Median length: 59
Mean length: 59.20648264
Min length: 59

Characters and Unicode

Total characters: 11043726
Distinct characters: 35
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 186529
Unique (%): 100.0%

Sample

1st row: http://collections.peabody.yale.edu/search/Record/YU.036650
2nd row: http://collections.peabody.yale.edu/search/Record/CBS.028950
3rd row: http://collections.peabody.yale.edu/search/Record/YU.070008
4th row: http://collections.peabody.yale.edu/search/Record/YU.204399
5th row: http://collections.peabody.yale.edu/search/Record/YU.175465
ValueCountFrequency (%)
http://collections.peabody.yale.edu/search/record/yu.036650 1
 
< 0.1%
http://collections.peabody.yale.edu/search/record/yu.065082 1
 
< 0.1%
http://collections.peabody.yale.edu/search/record/yu.065678 1
 
< 0.1%
http://collections.peabody.yale.edu/search/record/yu.234842 1
 
< 0.1%
http://collections.peabody.yale.edu/search/record/yu.012442 1
 
< 0.1%
http://collections.peabody.yale.edu/search/record/yu.070008 1
 
< 0.1%
http://collections.peabody.yale.edu/search/record/yu.204399 1
 
< 0.1%
http://collections.peabody.yale.edu/search/record/yu.175465 1
 
< 0.1%
http://collections.peabody.yale.edu/search/record/yu.060443 1
 
< 0.1%
http://collections.peabody.yale.edu/search/record/yu.038995 1
 
< 0.1%
Other values (186519) 186519
> 99.9%

Most occurring characters

ValueCountFrequency (%)
e 1119174
 
10.1%
/ 932645
 
8.4%
. 746144
 
6.8%
c 746116
 
6.8%
o 746116
 
6.8%
l 559587
 
5.1%
a 559587
 
5.1%
t 559587
 
5.1%
d 559587
 
5.1%
h 373058
 
3.4%
Other values (25) 4142125
37.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7461160
67.6%
Other Punctuation 1865318
 
16.9%
Decimal Number 1119258
 
10.1%
Uppercase Letter 597990
 
5.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1119174
15.0%
c 746116
10.0%
o 746116
10.0%
l 559587
 
7.5%
a 559587
 
7.5%
t 559587
 
7.5%
d 559587
 
7.5%
h 373058
 
5.0%
y 373058
 
5.0%
p 373058
 
5.0%
Other values (6) 1492232
20.0%
Decimal Number
ValueCountFrequency (%)
0 223252
19.9%
2 150900
13.5%
1 131446
11.7%
3 102608
9.2%
4 92196
8.2%
5 86628
 
7.7%
8 85549
 
7.6%
7 83994
 
7.5%
6 83823
 
7.5%
9 78862
 
7.0%
Uppercase Letter
ValueCountFrequency (%)
R 186529
31.2%
Y 148126
24.8%
U 148126
24.8%
C 38403
 
6.4%
B 38403
 
6.4%
S 38403
 
6.4%
Other Punctuation
ValueCountFrequency (%)
/ 932645
50.0%
. 746144
40.0%
: 186529
 
10.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8059150
73.0%
Common 2984576
 
27.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1119174
13.9%
c 746116
 
9.3%
o 746116
 
9.3%
l 559587
 
6.9%
a 559587
 
6.9%
t 559587
 
6.9%
d 559587
 
6.9%
h 373058
 
4.6%
y 373058
 
4.6%
p 373058
 
4.6%
Other values (12) 2090222
25.9%
Common
ValueCountFrequency (%)
/ 932645
31.2%
. 746144
25.0%
0 223252
 
7.5%
: 186529
 
6.2%
2 150900
 
5.1%
1 131446
 
4.4%
3 102608
 
3.4%
4 92196
 
3.1%
5 86628
 
2.9%
8 85549
 
2.9%
Other values (3) 246679
 
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11043726
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 1119174
 
10.1%
/ 932645
 
8.4%
. 746144
 
6.8%
c 746116
 
6.8%
o 746116
 
6.8%
l 559587
 
5.1%
a 559587
 
5.1%
t 559587
 
5.1%
d 559587
 
5.1%
h 373058
 
3.4%
Other values (25) 4142125
37.5%

rightsHolder
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

Characters and Unicode

Total characters: 3544051
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Yale Peabody Museum
2nd row: Yale Peabody Museum
3rd row: Yale Peabody Museum
4th row: Yale Peabody Museum
5th row: Yale Peabody Museum
ValueCountFrequency (%)
yale 186529
33.3%
peabody 186529
33.3%
museum 186529
33.3%

Most occurring characters

ValueCountFrequency (%)
e 559587
15.8%
a 373058
10.5%
(space) 373058
10.5%
u 373058
10.5%
Y 186529
 
5.3%
l 186529
 
5.3%
P 186529
 
5.3%
b 186529
 
5.3%
o 186529
 
5.3%
d 186529
 
5.3%
Other values (4) 746116
21.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2611406
73.7%
Uppercase Letter 559587
 
15.8%
Space Separator 373058
 
10.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 559587
21.4%
a 373058
14.3%
u 373058
14.3%
l 186529
 
7.1%
b 186529
 
7.1%
o 186529
 
7.1%
d 186529
 
7.1%
y 186529
 
7.1%
s 186529
 
7.1%
m 186529
 
7.1%
Uppercase Letter
ValueCountFrequency (%)
Y 186529
33.3%
P 186529
33.3%
M 186529
33.3%
Space Separator
ValueCountFrequency (%)
(space) 373058
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3170993
89.5%
Common 373058
 
10.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 559587
17.6%
a 373058
11.8%
u 373058
11.8%
Y 186529
 
5.9%
l 186529
 
5.9%
P 186529
 
5.9%
b 186529
 
5.9%
o 186529
 
5.9%
d 186529
 
5.9%
y 186529
 
5.9%
Other values (3) 559587
17.6%
Common
ValueCountFrequency (%)
(space) 373058
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3544051
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 559587
15.8%
a 373058
10.5%
(space) 373058
10.5%
u 373058
10.5%
Y 186529
 
5.3%
l 186529
 
5.3%
P 186529
 
5.3%
b 186529
 
5.3%
o 186529
 
5.3%
d 186529
 
5.3%
Other values (4) 746116
21.1%

type
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 14
Median length: 14
Mean length: 14
Min length: 14

Characters and Unicode

Total characters: 2611406
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PhysicalObject
2nd row: PhysicalObject
3rd row: PhysicalObject
4th row: PhysicalObject
5th row: PhysicalObject
ValueCountFrequency (%)
physicalobject 186529
100.0%

Most occurring characters

ValueCountFrequency (%)
c 373058
14.3%
P 186529
 
7.1%
h 186529
 
7.1%
y 186529
 
7.1%
s 186529
 
7.1%
i 186529
 
7.1%
a 186529
 
7.1%
l 186529
 
7.1%
O 186529
 
7.1%
b 186529
 
7.1%
Other values (3) 559587
21.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2238348
85.7%
Uppercase Letter 373058
 
14.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
c 373058
16.7%
h 186529
8.3%
y 186529
8.3%
s 186529
8.3%
i 186529
8.3%
a 186529
8.3%
l 186529
8.3%
b 186529
8.3%
j 186529
8.3%
e 186529
8.3%
Uppercase Letter
ValueCountFrequency (%)
P 186529
50.0%
O 186529
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2611406
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
c 373058
14.3%
P 186529
 
7.1%
h 186529
 
7.1%
y 186529
 
7.1%
s 186529
 
7.1%
i 186529
 
7.1%
a 186529
 
7.1%
l 186529
 
7.1%
O 186529
 
7.1%
b 186529
 
7.1%
Other values (3) 559587
21.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2611406
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
c 373058
14.3%
P 186529
 
7.1%
h 186529
 
7.1%
y 186529
 
7.1%
s 186529
 
7.1%
i 186529
 
7.1%
a 186529
 
7.1%
l 186529
 
7.1%
O 186529
 
7.1%
b 186529
 
7.1%
Other values (3) 559587
21.4%
Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 186529
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 0
4th row: 1
5th row: 1
ValueCountFrequency (%)
1 177440
95.1%
0 9089
 
4.9%

Most occurring characters

ValueCountFrequency (%)
1 177440
95.1%
0 9089
 
4.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 186529
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 177440
95.1%
0 9089
 
4.9%

Most occurring scripts

ValueCountFrequency (%)
Common 186529
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 177440
95.1%
0 9089
 
4.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 186529
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 177440
95.1%
0 9089
 
4.9%

institutionCode
Text

Constant 

Distinct 1
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 1.4 MiB

Length

Max length 3
Median length 3
Mean length 3
Min length 3

Characters and Unicode

Total characters 559587
Distinct characters 3
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row YPM
2nd row YPM
3rd row YPM
4th row YPM
5th row YPM
ValueCountFrequency (%)
ypm 186529
100.0%

Most occurring characters

ValueCountFrequency (%)
Y 186529
33.3%
P 186529
33.3%
M 186529
33.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 559587
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
Y 186529
33.3%
P 186529
33.3%
M 186529
33.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 559587
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
Y 186529
33.3%
P 186529
33.3%
M 186529
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 559587
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
Y 186529
33.3%
P 186529
33.3%
M 186529
33.3%
Distinct 2
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 1.4 MiB

Length

Max length 3
Median length 2
Mean length 2.205882195
Min length 2

Characters and Unicode

Total characters 411461
Distinct characters 5
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row YU
2nd row CBS
3rd row YU
4th row YU
5th row YU
ValueCountFrequency (%)
yu 148126
79.4%
cbs 38403
 
20.6%

Most occurring characters

ValueCountFrequency (%)
Y 148126
36.0%
U 148126
36.0%
C 38403
 
9.3%
B 38403
 
9.3%
S 38403
 
9.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 411461
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
Y 148126
36.0%
U 148126
36.0%
C 38403
 
9.3%
B 38403
 
9.3%
S 38403
 
9.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 411461
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
Y 148126
36.0%
U 148126
36.0%
C 38403
 
9.3%
B 38403
 
9.3%
S 38403
 
9.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 411461
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
Y 148126
36.0%
U 148126
36.0%
C 38403
 
9.3%
B 38403
 
9.3%
S 38403
 
9.3%

ownerInstitutionCode
Text

Constant 

Distinct 1
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 1.4 MiB

Length

Max length 3
Median length 3
Mean length 3
Min length 3

Characters and Unicode

Total characters 559587
Distinct characters 3
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row YPM
2nd row YPM
3rd row YPM
4th row YPM
5th row YPM
ValueCountFrequency (%)
ypm 186529
100.0%

Most occurring characters

ValueCountFrequency (%)
Y 186529
33.3%
P 186529
33.3%
M 186529
33.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 559587
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
Y 186529
33.3%
P 186529
33.3%
M 186529
33.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 559587
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
Y 186529
33.3%
P 186529
33.3%
M 186529
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 559587
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
Y 186529
33.3%
P 186529
33.3%
M 186529
33.3%

basisOfRecord
Text

Constant 

Distinct 1
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 1.4 MiB

Length

Max length 18
Median length 18
Mean length 18
Min length 18

Characters and Unicode

Total characters 3357522
Distinct characters 11
Distinct categories 2
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
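
As a rough illustration of how per-category character counts like the ones in this report can be derived, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. The helper name `category_counts` and the partial `CATEGORY_NAMES` mapping are assumptions made for this example; this is not the profiler's own implementation.

```python
import unicodedata
from collections import Counter

# Long names for a few general-category codes (partial, illustrative mapping).
CATEGORY_NAMES = {
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Pc": "Connector Punctuation",
    "Zs": "Space Separator",
}

def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)  # two-letter code, e.g. "Lu"
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

# Five copies of the constant basisOfRecord value, as in the sample rows.
counts = category_counts(["PRESERVED_SPECIMEN"] * 5)
print(counts)
```

Each value contributes 17 uppercase letters and one underscore, which is consistent with the 94.4% / 5.6% split reported for this variable.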

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row PRESERVED_SPECIMEN
2nd row PRESERVED_SPECIMEN
3rd row PRESERVED_SPECIMEN
4th row PRESERVED_SPECIMEN
5th row PRESERVED_SPECIMEN
ValueCountFrequency (%)
preserved_specimen 186529
100.0%

Most occurring characters

ValueCountFrequency (%)
E 932645
27.8%
P 373058
 
11.1%
R 373058
 
11.1%
S 373058
 
11.1%
V 186529
 
5.6%
D 186529
 
5.6%
_ 186529
 
5.6%
C 186529
 
5.6%
I 186529
 
5.6%
M 186529
 
5.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 3170993
94.4%
Connector Punctuation 186529
 
5.6%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 932645
29.4%
P 373058
 
11.8%
R 373058
 
11.8%
S 373058
 
11.8%
V 186529
 
5.9%
D 186529
 
5.9%
C 186529
 
5.9%
I 186529
 
5.9%
M 186529
 
5.9%
N 186529
 
5.9%
Connector Punctuation
ValueCountFrequency (%)
_ 186529
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3170993
94.4%
Common 186529
 
5.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 932645
29.4%
P 373058
 
11.8%
R 373058
 
11.8%
S 373058
 
11.8%
V 186529
 
5.9%
D 186529
 
5.9%
C 186529
 
5.9%
I 186529
 
5.9%
M 186529
 
5.9%
N 186529
 
5.9%
Common
ValueCountFrequency (%)
_ 186529
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3357522
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 932645
27.8%
P 373058
 
11.1%
R 373058
 
11.1%
S 373058
 
11.1%
V 186529
 
5.6%
D 186529
 
5.6%
_ 186529
 
5.6%
C 186529
 
5.6%
I 186529
 
5.6%
M 186529
 
5.6%

dynamicProperties
Text

Unique 

Distinct 186529
Distinct (%) 100.0%
Missing 0
Missing (%) 0.0%
Memory size 1.4 MiB

Length

Max length 468
Median length 364
Mean length 129.8176048
Min length 20

Characters and Unicode

Total characters 24214748
Distinct characters 44
Distinct categories 10
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 186529
Unique (%) 100.0%

Sample

1st row { "irn": "1284160", "media": "1049200:23e7a3e4-d0b0-4e83-9ff2-192065f61a5a", "mm_repository_id": "1049200" }
2nd row { "irn": "1377942", "media": "109412:2de8b571-4db4-4d56-aca6-faa3477edb7c", "mm_repository_id": "109412", "solr_long_lat": "-72.2664,41.4854" }
3rd row { "irn": "908073", "solr_long_lat": "-72.9316,41.4070" }
4th row { "irn": "1892063", "media": "268631:3adf8b86-2732-45cd-aef6-c1ead71bd726", "mm_repository_id": "268631", "solr_long_lat": "-119,51" }
5th row { "irn": "2463858", "media": "1186778:f2d4000d-7289-44d9-bba3-f87582cd4f33 1186779:5b8ba8d4-ba11-4789-b865-bf0d163e1e42", "mm_repository_id": "1186778" }
ValueCountFrequency (%)
373805
22.1%
irn 186529
 
11.0%
mm_repository_id 177182
 
10.5%
media 177182
 
10.5%
solr_long_lat 104429
 
6.2%
72.9316,41.4070 1988
 
0.1%
72.920823,41.305111 1951
 
0.1%
72.9247,41.3114 1870
 
0.1%
72.88,41.6050 1661
 
0.1%
73.036,41.5583 1211
 
0.1%
Other values (569062) 662556
39.2%
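
The sample dynamicProperties values are JSON objects, so they can be unpacked into separate fields before analysis. The sketch below is one way to do that with the standard `json` module. It assumes that every value parses as JSON, that `solr_long_lat` holds "longitude,latitude" as its name suggests, and that multiple `media` entries are separated by whitespace. The helper name and the derived keys (`longitude`, `latitude`, `media_items`) are hypothetical.

```python
import json

def parse_dynamic_properties(text):
    """Parse one dynamicProperties value and add a few derived fields."""
    props = json.loads(text)
    # Assumption: "solr_long_lat" is "longitude,latitude".
    lon_lat = props.get("solr_long_lat")
    if lon_lat:
        lon, lat = lon_lat.split(",")
        props["longitude"] = float(lon)
        props["latitude"] = float(lat)
    # Assumption: several media entries are separated by whitespace.
    props["media_items"] = props.get("media", "").split()
    return props

# Second sample row from the report.
row = ('{ "irn": "1377942", "media": "109412:2de8b571-4db4-4d56-aca6-faa3477edb7c", '
       '"mm_repository_id": "109412", "solr_long_lat": "-72.2664,41.4854" }')
parsed = parse_dynamic_properties(row)
print(parsed["irn"], parsed["longitude"], parsed["latitude"], parsed["media_items"])
```

Rows without `solr_long_lat` or `media` simply lack the coordinate keys or get an empty `media_items` list, which matches the report's counts showing that both keys are absent from part of the records.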

Most occurring characters

ValueCountFrequency (%)
" 2587264
 
10.7%
1503835
 
6.2%
1 1306746
 
5.4%
4 1080313
 
4.5%
2 1005309
 
4.2%
- 909018
 
3.8%
9 877072
 
3.6%
8 856889
 
3.5%
3 854143
 
3.5%
7 850393
 
3.5%
Other values (34) 12383766
51.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 9188047
37.9%
Lowercase Letter 7460825
30.8%
Other Punctuation 4207683
17.4%
Space Separator 1503835
 
6.2%
Dash Punctuation 909018
 
3.8%
Connector Punctuation 566210
 
2.3%
Open Punctuation 186529
 
0.8%
Close Punctuation 186529
 
0.8%
Uppercase Letter 5686
 
< 0.1%
Math Symbol 386
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 736138
9.9%
d 734755
9.8%
i 718822
9.6%
a 709722
9.5%
r 649804
 
8.7%
o 564716
 
7.6%
m 531546
 
7.1%
b 425696
 
5.7%
c 377479
 
5.1%
f 376923
 
5.1%
Other values (8) 1635224
21.9%
Decimal Number
ValueCountFrequency (%)
1 1306746
14.2%
4 1080313
11.8%
2 1005309
10.9%
9 877072
9.5%
8 856889
9.3%
3 854143
9.3%
7 850393
9.3%
6 823084
9.0%
0 786476
8.6%
5 747622
8.1%
Uppercase Letter
ValueCountFrequency (%)
Y 2266
39.9%
P 1140
20.0%
M 1140
20.0%
U 1126
19.8%
A 7
 
0.1%
R 7
 
0.1%
Other Punctuation
ValueCountFrequency (%)
" 2587264
61.5%
: 847672
 
20.1%
, 564716
 
13.4%
. 208031
 
4.9%
Space Separator
ValueCountFrequency (%)
1503835
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 909018
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 566210
100.0%
Open Punctuation
ValueCountFrequency (%)
{ 186529
100.0%
Close Punctuation
ValueCountFrequency (%)
} 186529
100.0%
Math Symbol
ValueCountFrequency (%)
| 386
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 16748237
69.2%
Latin 7466511
30.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 736138
9.9%
d 734755
9.8%
i 718822
9.6%
a 709722
9.5%
r 649804
 
8.7%
o 564716
 
7.6%
m 531546
 
7.1%
b 425696
 
5.7%
c 377479
 
5.1%
f 376923
 
5.0%
Other values (14) 1640910
22.0%
Common
ValueCountFrequency (%)
" 2587264
15.4%
1503835
 
9.0%
1 1306746
 
7.8%
4 1080313
 
6.5%
2 1005309
 
6.0%
- 909018
 
5.4%
9 877072
 
5.2%
8 856889
 
5.1%
3 854143
 
5.1%
7 850393
 
5.1%
Other values (10) 4917255
29.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 24214748
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
" 2587264
 
10.7%
1503835
 
6.2%
1 1306746
 
5.4%
4 1080313
 
4.5%
2 1005309
 
4.2%
- 909018
 
3.8%
9 877072
 
3.6%
8 856889
 
3.5%
3 854143
 
3.5%
7 850393
 
3.5%
Other values (34) 12383766
51.1%

occurrenceID
Text

Unique 

Distinct 186529
Distinct (%) 100.0%
Missing 0
Missing (%) 0.0%
Memory size 1.4 MiB

Length

Max length 45
Median length 45
Mean length 45
Min length 45

Characters and Unicode

Total characters 8393805
Distinct characters 22
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 186529
Unique (%) 100.0%

Sample

1st row urn:uuid:a15cbeaa-3fcd-4ec5-bfb1-27f0f8bc8910
2nd row urn:uuid:a15e0d7e-5095-4a84-b02b-fe689f416389
3rd row urn:uuid:a165d6f6-a6f1-4464-9d19-d307fba92359
4th row urn:uuid:a1674501-cb24-4a3a-9ef8-4d0751ad4e63
5th row urn:uuid:a169b221-8413-44a8-bccc-fa7045bf79df
ValueCountFrequency (%)
urn:uuid:a15cbeaa-3fcd-4ec5-bfb1-27f0f8bc8910 1
 
< 0.1%
urn:uuid:a1d1e7f6-c3fd-4cdf-92eb-181c3735610c 1
 
< 0.1%
urn:uuid:a19015bb-6550-4f6a-afda-a2f1f7015626 1
 
< 0.1%
urn:uuid:a276dcf5-b6fd-4a0e-a9c9-e3d67d274f2c 1
 
< 0.1%
urn:uuid:a18d197d-f3bd-4416-bdef-f4a9f2135f3e 1
 
< 0.1%
urn:uuid:a165d6f6-a6f1-4464-9d19-d307fba92359 1
 
< 0.1%
urn:uuid:a1674501-cb24-4a3a-9ef8-4d0751ad4e63 1
 
< 0.1%
urn:uuid:a169b221-8413-44a8-bccc-fa7045bf79df 1
 
< 0.1%
urn:uuid:a16fdf5e-d4db-44ab-8f13-95359c948f0c 1
 
< 0.1%
urn:uuid:a17057c5-3a20-44d8-b8bb-a4febbcf747a 1
 
< 0.1%
Other values (186519) 186519
> 99.9%

Most occurring characters

ValueCountFrequency (%)
- 746116
 
8.9%
u 559587
 
6.7%
d 536327
 
6.4%
4 535641
 
6.4%
8 397325
 
4.7%
a 396721
 
4.7%
b 396278
 
4.7%
9 396248
 
4.7%
: 373058
 
4.4%
c 350587
 
4.2%
Other values (12) 3705917
44.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3777062
45.0%
Lowercase Letter 3497569
41.7%
Dash Punctuation 746116
 
8.9%
Other Punctuation 373058
 
4.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u 559587
16.0%
d 536327
15.3%
a 396721
11.3%
b 396278
11.3%
c 350587
10.0%
f 349437
10.0%
e 349045
10.0%
r 186529
 
5.3%
i 186529
 
5.3%
n 186529
 
5.3%
Decimal Number
ValueCountFrequency (%)
4 535641
14.2%
8 397325
10.5%
9 396248
10.5%
1 350371
9.3%
6 349908
9.3%
7 349825
9.3%
5 349699
9.3%
3 349432
9.3%
0 349369
9.2%
2 349244
9.2%
Dash Punctuation
ValueCountFrequency (%)
- 746116
100.0%
Other Punctuation
ValueCountFrequency (%)
: 373058
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4896236
58.3%
Latin 3497569
41.7%

Most frequent character per script

Common
ValueCountFrequency (%)
- 746116
15.2%
4 535641
10.9%
8 397325
8.1%
9 396248
8.1%
: 373058
7.6%
1 350371
7.2%
6 349908
7.1%
7 349825
7.1%
5 349699
7.1%
3 349432
7.1%
Other values (2) 698613
14.3%
Latin
ValueCountFrequency (%)
u 559587
16.0%
d 536327
15.3%
a 396721
11.3%
b 396278
11.3%
c 350587
10.0%
f 349437
10.0%
e 349045
10.0%
r 186529
 
5.3%
i 186529
 
5.3%
n 186529
 
5.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8393805
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 746116
 
8.9%
u 559587
 
6.7%
d 536327
 
6.4%
4 535641
 
6.4%
8 397325
 
4.7%
a 396721
 
4.7%
b 396278
 
4.7%
9 396248
 
4.7%
: 373058
 
4.4%
c 350587
 
4.2%
Other values (12) 3705917
44.2%

catalogNumber
Text

Unique 

Distinct 186529
Distinct (%) 100.0%
Missing 0
Missing (%) 0.0%
Memory size 1.4 MiB

Length

Max length 13
Median length 9
Mean length 9.206482638
Min length 9

Characters and Unicode

Total characters 1717276
Distinct characters 16
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 186529
Unique (%) 100.0%

Sample

1st row YU.036650
2nd row CBS.028950
3rd row YU.070008
4th row YU.204399
5th row YU.175465
ValueCountFrequency (%)
yu.036650 1
 
< 0.1%
yu.065082 1
 
< 0.1%
yu.065678 1
 
< 0.1%
yu.234842 1
 
< 0.1%
yu.012442 1
 
< 0.1%
yu.070008 1
 
< 0.1%
yu.204399 1
 
< 0.1%
yu.175465 1
 
< 0.1%
yu.060443 1
 
< 0.1%
yu.038995 1
 
< 0.1%
Other values (186519) 186519
> 99.9%

Most occurring characters

ValueCountFrequency (%)
0 223252
13.0%
. 186557
10.9%
2 150900
8.8%
Y 148126
8.6%
U 148126
8.6%
1 131446
 
7.7%
3 102608
 
6.0%
4 92196
 
5.4%
5 86628
 
5.0%
8 85549
 
5.0%
Other values (6) 361888
21.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1119258
65.2%
Uppercase Letter 411461
 
24.0%
Other Punctuation 186557
 
10.9%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 223252
19.9%
2 150900
13.5%
1 131446
11.7%
3 102608
9.2%
4 92196
8.2%
5 86628
 
7.7%
8 85549
 
7.6%
7 83994
 
7.5%
6 83823
 
7.5%
9 78862
 
7.0%
Uppercase Letter
ValueCountFrequency (%)
Y 148126
36.0%
U 148126
36.0%
C 38403
 
9.3%
B 38403
 
9.3%
S 38403
 
9.3%
Other Punctuation
ValueCountFrequency (%)
. 186557
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1305815
76.0%
Latin 411461
 
24.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 223252
17.1%
. 186557
14.3%
2 150900
11.6%
1 131446
10.1%
3 102608
7.9%
4 92196
7.1%
5 86628
 
6.6%
8 85549
 
6.6%
7 83994
 
6.4%
6 83823
 
6.4%
Latin
ValueCountFrequency (%)
Y 148126
36.0%
U 148126
36.0%
C 38403
 
9.3%
B 38403
 
9.3%
S 38403
 
9.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1717276
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 223252
13.0%
. 186557
10.9%
2 150900
8.8%
Y 148126
8.6%
U 148126
8.6%
1 131446
 
7.7%
3 102608
 
6.0%
4 92196
 
5.4%
5 86628
 
5.0%
8 85549
 
5.0%
Other values (6) 361888
21.1%
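
The sampled catalogNumber values look like a letter prefix, a dot, and a zero-padded number (for example `YU.036650`). The report counts 186557 dots across 186529 values, so a small number of values must contain more than one dot. For that reason the sketch below splits on the first dot only and keeps the remainder as text. The function name `split_catalog_number` is hypothetical, and the prefix/number reading of the pattern is an assumption drawn from the sample rows.

```python
def split_catalog_number(value):
    """Split a catalogNumber such as 'YU.036650' at its first dot.

    Returns (prefix, rest). The rest stays a string so that leading zeros
    and any further dots are preserved unchanged.
    """
    prefix, sep, rest = value.partition(".")
    if not sep:
        # No dot at all: treat the whole value as the remainder.
        return "", value
    return prefix, rest

for sample in ["YU.036650", "CBS.028950"]:
    print(split_catalog_number(sample))
```

Keeping the numeric part as a string avoids silently turning `036650` into `36650`, which would no longer match the original identifier.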

recordNumber
Text

Missing 

Distinct 13601
Distinct (%) 28.6%
Missing 139017
Missing (%) 74.5%
Memory size 1.4 MiB

Length

Max length 21
Median length 20
Mean length 3.446729247
Min length 1

Characters and Unicode

Total characters 163761
Distinct characters 77
Distinct categories 9
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
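
The summary percentages for recordNumber appear to use two different denominators. Missing (%) is relative to all 186529 observations, while Distinct (%) and the mean length are relative to the non-missing values only. The short check below recomputes them from the counts reported above. It is a sanity check on the arithmetic, not a description of how the profiler computes them, and the variable names are illustrative.

```python
# Counts copied from the report.
n_rows = 186529       # Number of observations
missing = 139017      # Missing
distinct = 13601      # Distinct
total_chars = 163761  # Total characters

present = n_rows - missing  # values that are actually filled in

missing_pct = round(100 * missing / n_rows, 1)      # share of all rows
distinct_pct = round(100 * distinct / present, 1)   # share of non-missing values
mean_length = total_chars / present                 # characters per non-missing value

print(present, missing_pct, distinct_pct, round(mean_length, 4))
```

With 47512 non-missing values, this gives 74.5% missing, 28.6% distinct and a mean length of about 3.4467, in agreement with the reported figures.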

Unique

Unique 7343
Unique (%) 15.5%

Sample

1st row 4856
2nd row 621
3rd row 12
4th row 545
5th row 4616
ValueCountFrequency (%)
2 265
 
0.5%
1 234
 
0.5%
3 209
 
0.4%
4 207
 
0.4%
8 177
 
0.4%
6 176
 
0.4%
5 171
 
0.4%
7 163
 
0.3%
9 156
 
0.3%
10 150
 
0.3%
Other values (12986) 46388
96.0%

Most occurring characters

ValueCountFrequency (%)
1 25684
15.7%
2 19997
12.2%
3 17169
10.5%
4 15274
9.3%
5 14907
9.1%
6 13292
8.1%
7 12976
7.9%
8 12576
7.7%
0 12513
7.6%
9 12410
7.6%
Other values (67) 6963
 
4.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 156798
95.7%
Lowercase Letter 2347
 
1.4%
Uppercase Letter 1734
 
1.1%
Other Punctuation 1266
 
0.8%
Space Separator 784
 
0.5%
Dash Punctuation 744
 
0.5%
Math Symbol 44
 
< 0.1%
Open Punctuation 22
 
< 0.1%
Close Punctuation 22
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 795
33.9%
p 631
26.9%
b 355
15.1%
c 92
 
3.9%
d 77
 
3.3%
u 63
 
2.7%
n 62
 
2.6%
e 49
 
2.1%
o 32
 
1.4%
r 26
 
1.1%
Other values (15) 165
 
7.0%
Uppercase Letter
ValueCountFrequency (%)
S 188
10.8%
D 177
10.2%
X 156
 
9.0%
B 139
 
8.0%
I 130
 
7.5%
A 124
 
7.2%
P 115
 
6.6%
C 114
 
6.6%
E 104
 
6.0%
W 75
 
4.3%
Other values (15) 412
23.8%
Decimal Number
ValueCountFrequency (%)
1 25684
16.4%
2 19997
12.8%
3 17169
10.9%
4 15274
9.7%
5 14907
9.5%
6 13292
8.5%
7 12976
8.3%
8 12576
8.0%
0 12513
8.0%
9 12410
7.9%
Other Punctuation
ValueCountFrequency (%)
. 834
65.9%
/ 221
 
17.5%
, 138
 
10.9%
# 35
 
2.8%
& 16
 
1.3%
: 10
 
0.8%
? 6
 
0.5%
' 5
 
0.4%
; 1
 
0.1%
Math Symbol
ValueCountFrequency (%)
= 22
50.0%
+ 22
50.0%
Open Punctuation
ValueCountFrequency (%)
( 20
90.9%
[ 2
 
9.1%
Close Punctuation
ValueCountFrequency (%)
) 20
90.9%
] 2
 
9.1%
Space Separator
ValueCountFrequency (%)
784
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 744
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 159680
97.5%
Latin 4081
 
2.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 795
19.5%
p 631
15.5%
b 355
 
8.7%
S 188
 
4.6%
D 177
 
4.3%
X 156
 
3.8%
B 139
 
3.4%
I 130
 
3.2%
A 124
 
3.0%
P 115
 
2.8%
Other values (40) 1271
31.1%
Common
ValueCountFrequency (%)
1 25684
16.1%
2 19997
12.5%
3 17169
10.8%
4 15274
9.6%
5 14907
9.3%
6 13292
8.3%
7 12976
8.1%
8 12576
7.9%
0 12513
7.8%
9 12410
7.8%
Other values (17) 2882
 
1.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 163761
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 25684
15.7%
2 19997
12.2%
3 17169
10.5%
4 15274
9.3%
5 14907
9.1%
6 13292
8.1%
7 12976
7.9%
8 12576
7.7%
0 12513
7.6%
9 12410
7.6%
Other values (67) 6963
 
4.3%

recordedBy
Text

Missing 

Distinct 3451
Distinct (%) 3.1%
Missing 75764
Missing (%) 40.6%
Memory size 1.4 MiB

Length

Max length 98
Median length 94
Mean length 16.95773033
Min length 2

Characters and Unicode

Total characters 1878323
Distinct characters 80
Distinct categories 8
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1506
Unique (%) 1.4%

Sample

1st row Charles H. Bissell
2nd row Horatio N. Fenn
3rd row Alfred H. Brinkman
4th row Charles C. Godfrey
5th row Charles H. Bissell
ValueCountFrequency (%)
h 17884
 
5.3%
charles 16797
 
5.0%
w 13815
 
4.1%
e 13699
 
4.1%
a 9233
 
2.8%
george 9101
 
2.7%
bissell 8948
 
2.7%
c 7711
 
2.3%
nichols 6625
 
2.0%
b 6460
 
1.9%
Other values (2822) 225265
67.1%

Most occurring characters

ValueCountFrequency (%)
224773
 
12.0%
e 165937
 
8.8%
r 129288
 
6.9%
a 119892
 
6.4%
l 112246
 
6.0%
. 107354
 
5.7%
n 97287
 
5.2%
s 80163
 
4.3%
i 77500
 
4.1%
o 75225
 
4.0%
Other values (70) 688658
36.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1200369
63.9%
Uppercase Letter 335975
 
17.9%
Space Separator 224773
 
12.0%
Other Punctuation 116139
 
6.2%
Decimal Number 799
 
< 0.1%
Close Punctuation 96
 
< 0.1%
Open Punctuation 96
 
< 0.1%
Dash Punctuation 76
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 165937
13.8%
r 129288
10.8%
a 119892
10.0%
l 112246
9.4%
n 97287
8.1%
s 80163
 
6.7%
i 77500
 
6.5%
o 75225
 
6.3%
h 59764
 
5.0%
t 52406
 
4.4%
Other values (21) 230661
19.2%
Uppercase Letter
ValueCountFrequency (%)
C 36558
10.9%
E 36255
10.8%
H 33567
10.0%
A 31547
9.4%
W 29410
 
8.8%
B 28312
 
8.4%
S 17816
 
5.3%
G 17569
 
5.2%
J 15225
 
4.5%
L 13747
 
4.1%
Other values (17) 75969
22.6%
Decimal Number
ValueCountFrequency (%)
1 400
50.1%
9 184
23.0%
4 107
 
13.4%
8 59
 
7.4%
3 20
 
2.5%
2 18
 
2.3%
5 6
 
0.8%
7 3
 
0.4%
6 2
 
0.3%
Other Punctuation
ValueCountFrequency (%)
. 107354
92.4%
, 4807
 
4.1%
; 3914
 
3.4%
' 53
 
< 0.1%
? 5
 
< 0.1%
& 4
 
< 0.1%
/ 2
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 91
94.8%
] 5
 
5.2%
Open Punctuation
ValueCountFrequency (%)
( 91
94.8%
[ 5
 
5.2%
Space Separator
ValueCountFrequency (%)
224773
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 76
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1536344
81.8%
Common 341979
 
18.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 165937
 
10.8%
r 129288
 
8.4%
a 119892
 
7.8%
l 112246
 
7.3%
n 97287
 
6.3%
s 80163
 
5.2%
i 77500
 
5.0%
o 75225
 
4.9%
h 59764
 
3.9%
t 52406
 
3.4%
Other values (48) 566636
36.9%
Common
ValueCountFrequency (%)
224773
65.7%
. 107354
31.4%
, 4807
 
1.4%
; 3914
 
1.1%
1 400
 
0.1%
9 184
 
0.1%
4 107
 
< 0.1%
) 91
 
< 0.1%
( 91
 
< 0.1%
- 76
 
< 0.1%
Other values (12) 182
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1878190
> 99.9%
None 133
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
224773
 
12.0%
e 165937
 
8.8%
r 129288
 
6.9%
a 119892
 
6.4%
l 112246
 
6.0%
. 107354
 
5.7%
n 97287
 
5.2%
s 80163
 
4.3%
i 77500
 
4.1%
o 75225
 
4.0%
Other values (64) 688525
36.7%
None
ValueCountFrequency (%)
á 122
91.7%
ö 4
 
3.0%
ô 4
 
3.0%
è 1
 
0.8%
É 1
 
0.8%
ä 1
 
0.8%

individualCount
Text

Constant 

Distinct 1
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 1.4 MiB

Length

Max length 1
Median length 1
Mean length 1
Min length 1

Characters and Unicode

Total characters 186529
Distinct characters 1
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 1
2nd row 1
3rd row 1
4th row 1
5th row 1
ValueCountFrequency (%)
1 186529
100.0%

Most occurring characters

ValueCountFrequency (%)
1 186529
100.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 186529
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 186529
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 186529
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 186529
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 186529
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 186529
100.0%

reproductiveCondition
Text

Missing 

Distinct 4
Distinct (%) 16.0%
Missing 186504
Missing (%) > 99.9%
Memory size 1.4 MiB

Length

Max length 21
Median length 9
Mean length 10.28
Min length 8

Characters and Unicode

Total characters 257
Distinct characters 20
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 4.0%

Sample

1st row Flowering
2nd row Flowering
3rd row Flowering
4th row Flowering & Fruiting.
5th row Fruiting
ValueCountFrequency (%)
flowering 20
62.5%
fruiting 6
 
18.8%
2
 
6.2%
male 1
 
3.1%
and 1
 
3.1%
female 1
 
3.1%
cones 1
 
3.1%

Most occurring characters

ValueCountFrequency (%)
i 32
12.5%
n 28
10.9%
F 26
10.1%
r 26
10.1%
g 26
10.1%
e 24
9.3%
l 22
8.6%
o 21
8.2%
w 20
7.8%
7
 
2.7%
Other values (10) 25
9.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 220
85.6%
Uppercase Letter 26
 
10.1%
Space Separator 7
 
2.7%
Other Punctuation 4
 
1.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 32
14.5%
n 28
12.7%
r 26
11.8%
g 26
11.8%
e 24
10.9%
l 22
10.0%
o 21
9.5%
w 20
9.1%
t 6
 
2.7%
u 6
 
2.7%
Other values (6) 9
 
4.1%
Other Punctuation
ValueCountFrequency (%)
& 2
50.0%
. 2
50.0%
Uppercase Letter
ValueCountFrequency (%)
F 26
100.0%
Space Separator
ValueCountFrequency (%)
7
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 246
95.7%
Common 11
 
4.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 32
13.0%
n 28
11.4%
F 26
10.6%
r 26
10.6%
g 26
10.6%
e 24
9.8%
l 22
8.9%
o 21
8.5%
w 20
8.1%
t 6
 
2.4%
Other values (7) 15
6.1%
Common
ValueCountFrequency (%)
7
63.6%
& 2
 
18.2%
. 2
 
18.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 257
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 32
12.5%
n 28
10.9%
F 26
10.1%
r 26
10.1%
g 26
10.1%
e 24
9.3%
l 22
8.6%
o 21
8.2%
w 20
7.8%
7
 
2.7%
Other values (10) 25
9.7%

occurrenceStatus
Text

Constant 

Distinct 1
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 1.4 MiB

Length

Max length 7
Median length 7
Mean length 7
Min length 7

Characters and Unicode

Total characters 1305703
Distinct characters 6
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row PRESENT
2nd row PRESENT
3rd row PRESENT
4th row PRESENT
5th row PRESENT
ValueCountFrequency (%)
present 186529
100.0%

Most occurring characters

ValueCountFrequency (%)
E 373058
28.6%
P 186529
14.3%
R 186529
14.3%
S 186529
14.3%
N 186529
14.3%
T 186529
14.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1305703
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 373058
28.6%
P 186529
14.3%
R 186529
14.3%
S 186529
14.3%
N 186529
14.3%
T 186529
14.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 1305703
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 373058
28.6%
P 186529
14.3%
R 186529
14.3%
S 186529
14.3%
N 186529
14.3%
T 186529
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1305703
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 373058
28.6%
P 186529
14.3%
R 186529
14.3%
S 186529
14.3%
N 186529
14.3%
T 186529
14.3%

preparations
Text

Constant  Missing 

Distinct 1
Distinct (%) 1.9%
Missing 186476
Missing (%) > 99.9%
Memory size 1.4 MiB

Length

Max length 15
Median length 15
Mean length 15
Min length 15

Characters and Unicode

Total characters 795
Distinct characters 13
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row tissue (frozen)
2nd row tissue (frozen)
3rd row tissue (frozen)
4th row tissue (frozen)
5th row tissue (frozen)
ValueCountFrequency (%)
tissue 53
50.0%
frozen 53
50.0%

Most occurring characters

ValueCountFrequency (%)
s 106
13.3%
e 106
13.3%
t 53
 
6.7%
i 53
 
6.7%
u 53
 
6.7%
53
 
6.7%
( 53
 
6.7%
f 53
 
6.7%
r 53
 
6.7%
o 53
 
6.7%
Other values (3) 159
20.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 636
80.0%
Space Separator 53
 
6.7%
Open Punctuation 53
 
6.7%
Close Punctuation 53
 
6.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 106
16.7%
e 106
16.7%
t 53
8.3%
i 53
8.3%
u 53
8.3%
f 53
8.3%
r 53
8.3%
o 53
8.3%
z 53
8.3%
n 53
8.3%
Space Separator
ValueCountFrequency (%)
53
100.0%
Open Punctuation
ValueCountFrequency (%)
( 53
100.0%
Close Punctuation
ValueCountFrequency (%)
) 53
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 636
80.0%
Common 159
 
20.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 106
16.7%
e 106
16.7%
t 53
8.3%
i 53
8.3%
u 53
8.3%
f 53
8.3%
r 53
8.3%
o 53
8.3%
z 53
8.3%
n 53
8.3%
Common
ValueCountFrequency (%)
53
33.3%
( 53
33.3%
) 53
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 795
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 106
13.3%
e 106
13.3%
t 53
 
6.7%
i 53
 
6.7%
u 53
 
6.7%
53
 
6.7%
( 53
 
6.7%
f 53
 
6.7%
r 53
 
6.7%
o 53
 
6.7%
Other values (3) 159
20.0%

disposition
Text

Constant 

Distinct 1
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 1.4 MiB

Length

Max length 13
Median length 13
Mean length 13
Min length 13

Characters and Unicode

Total characters 2424877
Distinct characters 8
Distinct categories 2
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row in collection
2nd row in collection
3rd row in collection
4th row in collection
5th row in collection
ValueCountFrequency (%)
in 186529
50.0%
collection 186529
50.0%

Most occurring characters

ValueCountFrequency (%)
i 373058
15.4%
n 373058
15.4%
c 373058
15.4%
o 373058
15.4%
l 373058
15.4%
186529
7.7%
e 186529
7.7%
t 186529
7.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2238348
92.3%
Space Separator 186529
 
7.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 373058
16.7%
n 373058
16.7%
c 373058
16.7%
o 373058
16.7%
l 373058
16.7%
e 186529
8.3%
t 186529
8.3%
Space Separator
ValueCountFrequency (%)
186529
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2238348
92.3%
Common 186529
 
7.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 373058
16.7%
n 373058
16.7%
c 373058
16.7%
o 373058
16.7%
l 373058
16.7%
e 186529
8.3%
t 186529
8.3%
Common
ValueCountFrequency (%)
186529
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2424877
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 373058
15.4%
n 373058
15.4%
c 373058
15.4%
o 373058
15.4%
l 373058
15.4%
186529
7.7%
e 186529
7.7%
t 186529
7.7%

associatedReferences
Text

Missing 

Distinct: 3765
Distinct (%): 37.4%
Missing: 176462
Missing (%): 94.6%
Memory size: 1.4 MiB
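
The percentages for this column can be checked against the raw counts. The sketch below assumes, based on how the numbers line up, that Missing (%) is taken over all 186529 observations while Distinct (%), Unique (%), and Mean length are taken over the non-missing values only. The variable names are illustrative, not the report's.

```python
# Counts copied from the report for associatedReferences.
n_obs = 186529        # number of observations in the dataset
missing = 176462      # missing values in this column
distinct = 3765       # distinct non-missing values
unique = 3122         # values that occur exactly once
total_chars = 439630  # total characters across non-missing values

present = n_obs - missing               # non-missing values
missing_pct = 100 * missing / n_obs     # share of all observations
distinct_pct = 100 * distinct / present # share of non-missing values
unique_pct = 100 * unique / present     # share of non-missing values
mean_length = total_chars / present     # average characters per value

print(present, round(missing_pct, 1), round(distinct_pct, 1),
      round(unique_pct, 1), round(mean_length, 8))
```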

Length

Max length: 481
Median length: 338
Mean length: 43.67040826
Min length: 1

Characters and Unicode

Total characters: 439630
Distinct characters: 93
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 3

Unique

Unique: 3122
Unique (%): 31.0%

Sample

1st row: Det. by: Martin C. Van Boskirk 1997|
2nd row: Det. by: Alexander W. Evans
3rd row: ISOTYPE. Note: Proc. Amer. Acad. Arts. 22: 420. 1887.
4th row: ISOSYNTYPE. Note: Mem. Amer. Acad. Arts. n.s. 520. 1862.
5th row: ISOTYPE. Note: Pl. Wright. (Grisebach) 1: 173. 1860.
ValueCountFrequency (%)
by 6513
 
8.7%
det 6278
 
8.4%
note 4081
 
5.5%
isotype 2637
 
3.5%
of 1965
 
2.6%
w 1081
 
1.4%
the 1033
 
1.4%
syntype 884
 
1.2%
arts 839
 
1.1%
amer 784
 
1.0%
Other values (2983) 48652
65.1%

Most occurring characters

ValueCountFrequency (%)
64680
 
14.7%
. 32693
 
7.4%
e 30865
 
7.0%
t 21244
 
4.8%
o 17185
 
3.9%
a 16258
 
3.7%
r 16177
 
3.7%
: 13805
 
3.1%
n 13259
 
3.0%
i 11346
 
2.6%
Other values (83) 202118
46.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 202214
46.0%
Uppercase Letter 77380
 
17.6%
Space Separator 64680
 
14.7%
Other Punctuation 47613
 
10.8%
Decimal Number 41766
 
9.5%
Math Symbol 4292
 
1.0%
Dash Punctuation 591
 
0.1%
Close Punctuation 546
 
0.1%
Open Punctuation 546
 
0.1%
Other Symbol 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 30865
15.3%
t 21244
10.5%
o 17185
 
8.5%
a 16258
 
8.0%
r 16177
 
8.0%
n 13259
 
6.6%
i 11346
 
5.6%
l 10399
 
5.1%
y 8787
 
4.3%
s 8678
 
4.3%
Other values (24) 48016
23.7%
Uppercase Letter
ValueCountFrequency (%)
D 6983
 
9.0%
P 6970
 
9.0%
E 6536
 
8.4%
S 6410
 
8.3%
A 6193
 
8.0%
N 6019
 
7.8%
Y 5458
 
7.1%
T 5403
 
7.0%
C 3719
 
4.8%
O 3662
 
4.7%
Other values (16) 20027
25.9%
Other Punctuation
ValueCountFrequency (%)
. 32693
68.7%
: 13805
29.0%
; 451
 
0.9%
, 420
 
0.9%
' 92
 
0.2%
? 81
 
0.2%
& 36
 
0.1%
" 27
 
0.1%
# 6
 
< 0.1%
/ 2
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 10167
24.3%
8 6112
14.6%
9 4977
11.9%
6 3913
 
9.4%
2 3487
 
8.3%
7 3041
 
7.3%
5 3040
 
7.3%
4 2459
 
5.9%
3 2393
 
5.7%
0 2177
 
5.2%
Math Symbol
ValueCountFrequency (%)
| 4244
98.9%
= 44
 
1.0%
+ 4
 
0.1%
Close Punctuation
ValueCountFrequency (%)
) 380
69.6%
] 164
30.0%
} 2
 
0.4%
Open Punctuation
ValueCountFrequency (%)
( 379
69.4%
[ 165
30.2%
{ 2
 
0.4%
Dash Punctuation
ValueCountFrequency (%)
- 563
95.3%
28
 
4.7%
Space Separator
ValueCountFrequency (%)
64680
100.0%
Other Symbol
ValueCountFrequency (%)
° 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 279594
63.6%
Common 160036
36.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 30865
 
11.0%
t 21244
 
7.6%
o 17185
 
6.1%
a 16258
 
5.8%
r 16177
 
5.8%
n 13259
 
4.7%
i 11346
 
4.1%
l 10399
 
3.7%
y 8787
 
3.1%
s 8678
 
3.1%
Other values (50) 125396
44.8%
Common
ValueCountFrequency (%)
64680
40.4%
. 32693
20.4%
: 13805
 
8.6%
1 10167
 
6.4%
8 6112
 
3.8%
9 4977
 
3.1%
| 4244
 
2.7%
6 3913
 
2.4%
2 3487
 
2.2%
7 3041
 
1.9%
Other values (23) 12917
 
8.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 439412
> 99.9%
None 190
 
< 0.1%
Punctuation 28
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
64680
 
14.7%
. 32693
 
7.4%
e 30865
 
7.0%
t 21244
 
4.8%
o 17185
 
3.9%
a 16258
 
3.7%
r 16177
 
3.7%
: 13805
 
3.1%
n 13259
 
3.0%
i 11346
 
2.6%
Other values (73) 201900
45.9%
None
ValueCountFrequency (%)
á 125
65.8%
ü 26
 
13.7%
é 23
 
12.1%
ö 8
 
4.2%
ä 2
 
1.1%
è 2
 
1.1%
° 2
 
1.1%
ë 1
 
0.5%
ñ 1
 
0.5%
Punctuation
ValueCountFrequency (%)
28
100.0%

associatedTaxa
Text

Missing 

Distinct: 745
Distinct (%): 99.7%
Missing: 185782
Missing (%): 99.6%
Memory size: 1.4 MiB

Length

Max length: 109
Median length: 21
Mean length: 29.93574297
Min length: 9

Characters and Unicode

Total characters: 22362
Distinct characters: 32
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 743
Unique (%): 99.5%

Sample

1st row: same sheet: YU.064497|same sheet: YU.064498|same sheet: YU.064500
2nd row: same sheet: YU.064978
3rd row: YU.000992
4th row: same sheet: YU.064670
5th row: same sheet: YU.001167
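
The sample rows suggest a pipe-delimited list in which each item is either a bare catalog identifier or a `relation: identifier` pair. A minimal sketch of splitting such a value follows; it assumes that layout holds for the whole column, which the samples alone do not guarantee. The function name `parse_associated_taxa` is hypothetical.

```python
def parse_associated_taxa(value):
    """Split a pipe-delimited associatedTaxa string into (relation, identifier) pairs.

    Items without a "relation: " prefix get None as their relation.
    """
    pairs = []
    for item in value.split("|"):
        item = item.strip()
        if not item:
            continue  # skip empty segments, e.g. from a trailing pipe
        relation, sep, target = item.partition(": ")
        if sep:
            pairs.append((relation, target))
        else:
            pairs.append((None, item))
    return pairs


print(parse_associated_taxa("same sheet: YU.064497|same sheet: YU.064498"))
print(parse_associated_taxa("YU.000992"))
```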
ValueCountFrequency (%)
sheet 965
35.8%
same 649
24.1%
replicate 9
 
0.3%
yu.065496|same 5
 
0.2%
yu.014017|same 5
 
0.2%
yu.014019|same 5
 
0.2%
yu.014020|same 5
 
0.2%
yu.014022 5
 
0.2%
yu.065492 5
 
0.2%
yu.065494|same 5
 
0.2%
Other values (832) 1037
38.5%

Most occurring characters

ValueCountFrequency (%)
e 2925
13.1%
1948
 
8.7%
s 1930
 
8.6%
0 1853
 
8.3%
6 1270
 
5.7%
. 1134
 
5.1%
Y 1133
 
5.1%
U 1126
 
5.0%
t 983
 
4.4%
: 983
 
4.4%
Other values (22) 7077
31.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 8823
39.5%
Decimal Number 6801
30.4%
Uppercase Letter 2287
 
10.2%
Other Punctuation 2117
 
9.5%
Space Separator 1948
 
8.7%
Math Symbol 386
 
1.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 2925
33.2%
s 1930
21.9%
t 983
 
11.1%
a 977
 
11.1%
h 971
 
11.0%
m 965
 
10.9%
r 18
 
0.2%
p 12
 
0.1%
c 12
 
0.1%
i 12
 
0.1%
Other values (2) 18
 
0.2%
Decimal Number
ValueCountFrequency (%)
0 1853
27.2%
6 1270
18.7%
5 693
 
10.2%
1 596
 
8.8%
4 587
 
8.6%
2 450
 
6.6%
9 375
 
5.5%
7 364
 
5.4%
3 322
 
4.7%
8 291
 
4.3%
Uppercase Letter
ValueCountFrequency (%)
Y 1133
49.5%
U 1126
49.2%
A 7
 
0.3%
P 7
 
0.3%
R 7
 
0.3%
M 7
 
0.3%
Other Punctuation
ValueCountFrequency (%)
. 1134
53.6%
: 983
46.4%
Space Separator
ValueCountFrequency (%)
1948
100.0%
Math Symbol
ValueCountFrequency (%)
| 386
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 11252
50.3%
Latin 11110
49.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 2925
26.3%
s 1930
17.4%
Y 1133
 
10.2%
U 1126
 
10.1%
t 983
 
8.8%
a 977
 
8.8%
h 971
 
8.7%
m 965
 
8.7%
r 18
 
0.2%
p 12
 
0.1%
Other values (8) 70
 
0.6%
Common
ValueCountFrequency (%)
1948
17.3%
0 1853
16.5%
6 1270
11.3%
. 1134
10.1%
: 983
8.7%
5 693
 
6.2%
1 596
 
5.3%
4 587
 
5.2%
2 450
 
4.0%
| 386
 
3.4%
Other values (4) 1352
12.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 22362
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 2925
13.1%
1948
 
8.7%
s 1930
 
8.6%
0 1853
 
8.3%
6 1270
 
5.7%
. 1134
 
5.1%
Y 1133
 
5.1%
U 1126
 
5.0%
t 983
 
4.4%
: 983
 
4.4%
Other values (22) 7077
31.6%

Distinct: 186516
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 509
Median length: 29
Mean length: 32.71075811
Min length: 24

Characters and Unicode

Total characters: 6101505
Distinct characters: 89
Distinct categories: 11
Distinct scripts: 2
Distinct blocks: 3

Unique

Unique: 186503
Unique (%): > 99.9%

Sample

1st row: YU number 36650; lot count 1
2nd row: CBS number 28950; lot count 1
3rd row: YU number 70008; lot count 1
4th row: YU number 204399; lot count 1
5th row: YU number 175465; lot count 1
ValueCountFrequency (%)
1 186654
15.5%
number 186532
15.5%
lot 186530
15.5%
count 186529
15.5%
yu 148138
12.3%
cbs 38404
 
3.2%
tall 1591
 
0.1%
dryopteris 1419
 
0.1%
ca 1393
 
0.1%
carex 1306
 
0.1%
Other values (156716) 268264
22.2%

Most occurring characters

ValueCountFrequency (%)
1020231
16.7%
o 410162
 
6.7%
t 410146
 
6.7%
n 405993
 
6.7%
u 403292
 
6.6%
1 323313
 
5.3%
e 243622
 
4.0%
r 227752
 
3.7%
l 226680
 
3.7%
; 215837
 
3.5%
Other values (79) 2214477
36.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3196751
52.4%
Decimal Number 1190785
 
19.5%
Space Separator 1020231
 
16.7%
Uppercase Letter 457401
 
7.5%
Other Punctuation 232829
 
3.8%
Math Symbol 2789
 
< 0.1%
Dash Punctuation 563
 
< 0.1%
Close Punctuation 64
 
< 0.1%
Open Punctuation 64
 
< 0.1%
Connector Punctuation 27
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 410162
12.8%
t 410146
12.8%
n 405993
12.7%
u 403292
12.6%
e 243622
7.6%
r 227752
7.1%
l 226680
7.1%
c 212563
6.6%
m 212319
6.6%
b 193316
6.0%
Other values (18) 250906
7.8%
Uppercase Letter
ValueCountFrequency (%)
Y 151690
33.2%
U 149387
32.7%
C 43689
 
9.6%
S 40747
 
8.9%
B 39685
 
8.7%
P 6322
 
1.4%
A 4630
 
1.0%
D 3750
 
0.8%
M 3438
 
0.8%
H 2160
 
0.5%
Other values (16) 11903
 
2.6%
Other Punctuation
ValueCountFrequency (%)
; 215837
92.7%
. 11916
 
5.1%
, 3123
 
1.3%
: 1459
 
0.6%
& 336
 
0.1%
/ 76
 
< 0.1%
' 40
 
< 0.1%
" 24
 
< 0.1%
% 7
 
< 0.1%
? 7
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 323313
27.2%
2 153590
12.9%
3 104319
 
8.8%
4 93865
 
7.9%
0 89929
 
7.6%
5 89211
 
7.5%
8 86892
 
7.3%
6 85406
 
7.2%
7 84783
 
7.1%
9 79477
 
6.7%
Math Symbol
ValueCountFrequency (%)
= 2773
99.4%
~ 5
 
0.2%
< 4
 
0.1%
+ 4
 
0.1%
> 3
 
0.1%
Dash Punctuation
ValueCountFrequency (%)
- 562
99.8%
1
 
0.2%
Close Punctuation
ValueCountFrequency (%)
) 63
98.4%
] 1
 
1.6%
Open Punctuation
ValueCountFrequency (%)
( 63
98.4%
[ 1
 
1.6%
Space Separator
ValueCountFrequency (%)
1020231
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 27
100.0%
Other Number
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3654152
59.9%
Common 2447353
40.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 410162
11.2%
t 410146
11.2%
n 405993
11.1%
u 403292
11.0%
e 243622
 
6.7%
r 227752
 
6.2%
l 226680
 
6.2%
c 212563
 
5.8%
m 212319
 
5.8%
b 193316
 
5.3%
Other values (44) 708307
19.4%
Common
ValueCountFrequency (%)
1020231
41.7%
1 323313
 
13.2%
; 215837
 
8.8%
2 153590
 
6.3%
3 104319
 
4.3%
4 93865
 
3.8%
0 89929
 
3.7%
5 89211
 
3.6%
8 86892
 
3.6%
6 85406
 
3.5%
Other values (25) 184760
 
7.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6101443
> 99.9%
None 61
 
< 0.1%
Punctuation 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1020231
16.7%
o 410162
 
6.7%
t 410146
 
6.7%
n 405993
 
6.7%
u 403292
 
6.6%
1 323313
 
5.3%
e 243622
 
4.0%
r 227752
 
3.7%
l 226680
 
3.7%
; 215837
 
3.5%
Other values (75) 2214415
36.3%
None
ValueCountFrequency (%)
á 30
49.2%
ñ 30
49.2%
1
 
1.6%
Punctuation
ValueCountFrequency (%)
1
100.0%

Distinct: 17165
Distinct (%): 9.2%
Missing: 18
Missing (%): < 0.1%
Memory size: 1.4 MiB

Length

Max length: 172
Median length: 139
Mean length: 16.42291339
Min length: 3

Characters and Unicode

Total characters: 3063054
Distinct characters: 59
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 8494
Unique (%): 4.6%

Sample

1st row: Luzula bulbosa
2nd row: Gentiana clausa
3rd row: Carex muhlenbergii|Carex muhlenbergii
4th row: Lophocolea minor
5th row: Plantae
ValueCountFrequency (%)
plantae 28374
 
8.5%
carex 8803
 
2.6%
var 4014
 
1.2%
dryopteris 2392
 
0.7%
sphagnum 2360
 
0.7%
juncus 1814
 
0.5%
frullania 1708
 
0.5%
asplenium 1557
 
0.5%
scapania 1517
 
0.5%
canadensis 1511
 
0.5%
Other values (14275) 280732
83.9%

Most occurring characters

ValueCountFrequency (%)
a 399621
13.0%
i 270647
 
8.8%
e 211263
 
6.9%
l 201909
 
6.6%
r 180523
 
5.9%
n 175073
 
5.7%
u 167028
 
5.5%
o 161624
 
5.3%
s 159635
 
5.2%
t 149512
 
4.9%
Other values (49) 986219
32.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2714576
88.6%
Uppercase Letter 190758
 
6.2%
Space Separator 148271
 
4.8%
Other Punctuation 4477
 
0.1%
Math Symbol 4244
 
0.1%
Dash Punctuation 726
 
< 0.1%
Open Punctuation 1
 
< 0.1%
Close Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 399621
14.7%
i 270647
10.0%
e 211263
 
7.8%
l 201909
 
7.4%
r 180523
 
6.7%
n 175073
 
6.4%
u 167028
 
6.2%
o 161624
 
6.0%
s 159635
 
5.9%
t 149512
 
5.5%
Other values (16) 637741
23.5%
Uppercase Letter
ValueCountFrequency (%)
P 49853
26.1%
C 26611
14.0%
S 17407
 
9.1%
A 14516
 
7.6%
L 11096
 
5.8%
D 7958
 
4.2%
R 7175
 
3.8%
E 7079
 
3.7%
B 6558
 
3.4%
M 6188
 
3.2%
Other values (16) 36317
19.0%
Other Punctuation
ValueCountFrequency (%)
. 4475
> 99.9%
? 2
 
< 0.1%
Space Separator
ValueCountFrequency (%)
148271
100.0%
Math Symbol
ValueCountFrequency (%)
| 4244
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 726
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2905334
94.9%
Common 157720
 
5.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 399621
13.8%
i 270647
 
9.3%
e 211263
 
7.3%
l 201909
 
6.9%
r 180523
 
6.2%
n 175073
 
6.0%
u 167028
 
5.7%
o 161624
 
5.6%
s 159635
 
5.5%
t 149512
 
5.1%
Other values (42) 828499
28.5%
Common
ValueCountFrequency (%)
148271
94.0%
. 4475
 
2.8%
| 4244
 
2.7%
- 726
 
0.5%
? 2
 
< 0.1%
( 1
 
< 0.1%
) 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3063054
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 399621
13.0%
i 270647
 
8.8%
e 211263
 
6.9%
l 201909
 
6.6%
r 180523
 
5.9%
n 175073
 
5.7%
u 167028
 
5.5%
o 161624
 
5.3%
s 159635
 
5.2%
t 149512
 
4.9%
Other values (49) 986219
32.2%

eventDate
Text

Missing 

Distinct: 19106
Distinct (%): 18.6%
Missing: 84019
Missing (%): 45.0%
Memory size: 1.4 MiB

Length

Max length: 21
Median length: 10
Mean length: 9.320905278
Min length: 4

Characters and Unicode

Total characters: 955486
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 7461
Unique (%): 7.3%

Sample

1st row: 1919-10-01
2nd row: 1822
3rd row: 1909-05-27
4th row: 1905-07-23
5th row: 1901-09-02
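
The values in this column mix year-only entries (`1822`), full dates (`1919-10-01`), and intervals written as `start/end` (`1902-08-20/1902-08-25`), which matches the reported minimum length of 4 and maximum of 21. The sketch below separates an interval into its endpoints and labels each endpoint's precision by its length. It assumes ISO 8601-style strings like those in the samples; both function names are hypothetical.

```python
def split_event_date(value):
    """Return (start, end) for an eventDate string; a single date is its own end."""
    start, sep, end = value.partition("/")
    return (start, end if sep else start)


def precision(part):
    """Classify one endpoint as 'year', 'month', or 'day' by its length."""
    return {4: "year", 7: "month", 10: "day"}.get(len(part), "unknown")


for raw in ["1919-10-01", "1822", "1902-08-20/1902-08-25"]:
    start, end = split_event_date(raw)
    print(raw, "->", start, end, precision(start))
```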
ValueCountFrequency (%)
1822 660
 
0.6%
1920 497
 
0.5%
1914 302
 
0.3%
1875 288
 
0.3%
1893 280
 
0.3%
1902-08-20/1902-08-25 228
 
0.2%
1859 225
 
0.2%
1876 213
 
0.2%
1915 208
 
0.2%
1862 205
 
0.2%
Other values (19096) 99404
97.0%

Most occurring characters

ValueCountFrequency (%)
- 179522
18.8%
1 178315
18.7%
0 166698
17.4%
9 118068
12.4%
8 72267
7.6%
2 64463
 
6.7%
7 42922
 
4.5%
6 36805
 
3.9%
3 35566
 
3.7%
5 34423
 
3.6%
Other values (2) 26437
 
2.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 774588
81.1%
Dash Punctuation 179522
 
18.8%
Other Punctuation 1376
 
0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 178315
23.0%
0 166698
21.5%
9 118068
15.2%
8 72267
9.3%
2 64463
 
8.3%
7 42922
 
5.5%
6 36805
 
4.8%
3 35566
 
4.6%
5 34423
 
4.4%
4 25061
 
3.2%
Dash Punctuation
ValueCountFrequency (%)
- 179522
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 1376
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 955486
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
- 179522
18.8%
1 178315
18.7%
0 166698
17.4%
9 118068
12.4%
8 72267
7.6%
2 64463
 
6.7%
7 42922
 
4.5%
6 36805
 
3.9%
3 35566
 
3.7%
5 34423
 
3.6%
Other values (2) 26437
 
2.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 955486
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 179522
18.8%
1 178315
18.7%
0 166698
17.4%
9 118068
12.4%
8 72267
7.6%
2 64463
 
6.7%
7 42922
 
4.5%
6 36805
 
3.9%
3 35566
 
3.7%
5 34423
 
3.6%
Other values (2) 26437
 
2.8%

startDayOfYear
Text

Missing 

Distinct: 366
Distinct (%): 0.4%
Missing: 103374
Missing (%): 55.4%
Memory size: 1.4 MiB

Length

Max length: 3
Median length: 3
Mean length: 2.925536648
Min length: 1

Characters and Unicode

Total characters: 243273
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 274
2nd row: 147
3rd row: 204
4th row: 245
5th row: 175
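
The first four sample values here match the ordinal day of the four full dates listed among the eventDate samples (1919-10-01, 1909-05-27, 1905-07-23, 1901-09-02). The sketch below shows that conversion with the standard library. It assumes the field is derived from a day-precision eventDate, which the report itself does not state; `day_of_year` is a hypothetical helper.

```python
from datetime import date


def day_of_year(iso_day):
    """Return the 1-based ordinal day within the year for a YYYY-MM-DD string."""
    return date.fromisoformat(iso_day).timetuple().tm_yday


for iso_day in ["1919-10-01", "1909-05-27", "1905-07-23", "1901-09-02"]:
    print(iso_day, day_of_year(iso_day))
```

Year-only values such as `1822` carry no day information, which would be consistent with this column having more missing values than eventDate.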
ValueCountFrequency (%)
232 837
 
1.0%
150 753
 
0.9%
201 751
 
0.9%
186 717
 
0.9%
249 701
 
0.8%
185 700
 
0.8%
200 669
 
0.8%
193 651
 
0.8%
172 624
 
0.8%
151 607
 
0.7%
Other values (356) 76145
91.6%

Most occurring characters

ValueCountFrequency (%)
2 55350
22.8%
1 51427
21.1%
3 20962
 
8.6%
5 18239
 
7.5%
6 17248
 
7.1%
4 17190
 
7.1%
0 16075
 
6.6%
9 15802
 
6.5%
8 15598
 
6.4%
7 15382
 
6.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 243273
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 55350
22.8%
1 51427
21.1%
3 20962
 
8.6%
5 18239
 
7.5%
6 17248
 
7.1%
4 17190
 
7.1%
0 16075
 
6.6%
9 15802
 
6.5%
8 15598
 
6.4%
7 15382
 
6.3%

Most occurring scripts

ValueCountFrequency (%)
Common 243273
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 55350
22.8%
1 51427
21.1%
3 20962
 
8.6%
5 18239
 
7.5%
6 17248
 
7.1%
4 17190
 
7.1%
0 16075
 
6.6%
9 15802
 
6.5%
8 15598
 
6.4%
7 15382
 
6.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 243273
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 55350
22.8%
1 51427
21.1%
3 20962
 
8.6%
5 18239
 
7.5%
6 17248
 
7.1%
4 17190
 
7.1%
0 16075
 
6.6%
9 15802
 
6.5%
8 15598
 
6.4%
7 15382
 
6.3%

endDayOfYear
Text

Missing 

Distinct: 366
Distinct (%): 0.4%
Missing: 103374
Missing (%): 55.4%
Memory size: 1.4 MiB

Length

Max length: 3
Median length: 3
Mean length: 2.925717034
Min length: 1

Characters and Unicode

Total characters: 243288
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 274
2nd row: 147
3rd row: 204
4th row: 245
5th row: 175
ValueCountFrequency (%)
201 752
 
0.9%
150 742
 
0.9%
249 720
 
0.9%
186 718
 
0.9%
237 689
 
0.8%
185 688
 
0.8%
200 677
 
0.8%
193 670
 
0.8%
172 626
 
0.8%
232 608
 
0.7%
Other values (356) 76265
91.7%

Most occurring characters

ValueCountFrequency (%)
2 55269
22.7%
1 51209
21.0%
3 20925
 
8.6%
5 18234
 
7.5%
6 17278
 
7.1%
4 17165
 
7.1%
0 16024
 
6.6%
9 15862
 
6.5%
7 15690
 
6.4%
8 15632
 
6.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 243288
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 55269
22.7%
1 51209
21.0%
3 20925
 
8.6%
5 18234
 
7.5%
6 17278
 
7.1%
4 17165
 
7.1%
0 16024
 
6.6%
9 15862
 
6.5%
7 15690
 
6.4%
8 15632
 
6.4%

Most occurring scripts

ValueCountFrequency (%)
Common 243288
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 55269
22.7%
1 51209
21.0%
3 20925
 
8.6%
5 18234
 
7.5%
6 17278
 
7.1%
4 17165
 
7.1%
0 16024
 
6.6%
9 15862
 
6.5%
7 15690
 
6.4%
8 15632
 
6.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 243288
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 55269
22.7%
1 51209
21.0%
3 20925
 
8.6%
5 18234
 
7.5%
6 17278
 
7.1%
4 17165
 
7.1%
0 16024
 
6.6%
9 15862
 
6.5%
7 15690
 
6.4%
8 15632
 
6.4%

year
Text

Missing 

Distinct: 206
Distinct (%): 0.2%
Missing: 84248
Missing (%): 45.2%
Memory size: 1.4 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 409124
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 7
Unique (%): < 0.1%

Sample

1st row: 1919
2nd row: 1822
3rd row: 1909
4th row: 1905
5th row: 1901
ValueCountFrequency (%)
1903 3557
 
3.5%
1908 3427
 
3.4%
1906 3121
 
3.1%
1909 3107
 
3.0%
1907 2461
 
2.4%
1905 2453
 
2.4%
1902 2442
 
2.4%
1904 2252
 
2.2%
1901 2247
 
2.2%
1910 2160
 
2.1%
Other values (196) 75054
73.4%

Most occurring characters

ValueCountFrequency (%)
1 125652
30.7%
9 95339
23.3%
8 44654
 
10.9%
0 40689
 
9.9%
2 25566
 
6.2%
3 20238
 
4.9%
7 16894
 
4.1%
5 14797
 
3.6%
6 13450
 
3.3%
4 11845
 
2.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 409124
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 125652
30.7%
9 95339
23.3%
8 44654
 
10.9%
0 40689
 
9.9%
2 25566
 
6.2%
3 20238
 
4.9%
7 16894
 
4.1%
5 14797
 
3.6%
6 13450
 
3.3%
4 11845
 
2.9%

Most occurring scripts

ValueCountFrequency (%)
Common 409124
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 125652
30.7%
9 95339
23.3%
8 44654
 
10.9%
0 40689
 
9.9%
2 25566
 
6.2%
3 20238
 
4.9%
7 16894
 
4.1%
5 14797
 
3.6%
6 13450
 
3.3%
4 11845
 
2.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 409124
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 125652
30.7%
9 95339
23.3%
8 44654
 
10.9%
0 40689
 
9.9%
2 25566
 
6.2%
3 20238
 
4.9%
7 16894
 
4.1%
5 14797
 
3.6%
6 13450
 
3.3%
4 11845
 
2.9%

month
Text

Missing 

Distinct: 12
Distinct (%): < 0.1%
Missing: 93636
Missing (%): 50.2%
Memory size: 1.4 MiB

Length

Max length: 2
Median length: 1
Mean length: 1.091094054
Min length: 1

Characters and Unicode

Total characters: 101355
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 10
2nd row: 5
3rd row: 7
4th row: 9
5th row: 6
ValueCountFrequency (%)
8 18299
19.7%
7 17323
18.6%
6 14919
16.1%
9 12921
13.9%
5 10473
11.3%
10 5121
 
5.5%
4 4638
 
5.0%
3 2657
 
2.9%
11 2066
 
2.2%
2 1834
 
2.0%
Other values (2) 2642
 
2.8%

Most occurring characters

ValueCountFrequency (%)
8 18299
18.1%
7 17323
17.1%
6 14919
14.7%
9 12921
12.7%
1 11895
11.7%
5 10473
10.3%
0 5121
 
5.1%
4 4638
 
4.6%
2 3109
 
3.1%
3 2657
 
2.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 101355
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
8 18299
18.1%
7 17323
17.1%
6 14919
14.7%
9 12921
12.7%
1 11895
11.7%
5 10473
10.3%
0 5121
 
5.1%
4 4638
 
4.6%
2 3109
 
3.1%
3 2657
 
2.6%

Most occurring scripts

ValueCountFrequency (%)
Common 101355
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
8 18299
18.1%
7 17323
17.1%
6 14919
14.7%
9 12921
12.7%
1 11895
11.7%
5 10473
10.3%
0 5121
 
5.1%
4 4638
 
4.6%
2 3109
 
3.1%
3 2657
 
2.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 101355
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
8 18299
18.1%
7 17323
17.1%
6 14919
14.7%
9 12921
12.7%
1 11895
11.7%
5 10473
10.3%
0 5121
 
5.1%
4 4638
 
4.6%
2 3109
 
3.1%
3 2657
 
2.6%

day
Text

Missing 

Distinct: 31
Distinct (%): < 0.1%
Missing: 104750
Missing (%): 56.2%
Memory size: 1.4 MiB

Length

Max length: 2
Median length: 2
Mean length: 1.717934922
Min length: 1

Characters and Unicode

Total characters: 140491
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 27
3rd row: 23
4th row: 2
5th row: 24
ValueCountFrequency (%)
20 3304
 
4.0%
12 3096
 
3.8%
30 3014
 
3.7%
10 2927
 
3.6%
19 2906
 
3.6%
15 2903
 
3.5%
17 2855
 
3.5%
13 2816
 
3.4%
8 2799
 
3.4%
4 2741
 
3.4%
Other values (21) 52418
64.1%

Most occurring characters

ValueCountFrequency (%)
1 37526
26.7%
2 33938
24.2%
3 12087
 
8.6%
0 9245
 
6.6%
5 8118
 
5.8%
4 8099
 
5.8%
7 8096
 
5.8%
8 7960
 
5.7%
9 7762
 
5.5%
6 7660
 
5.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 140491
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 37526
26.7%
2 33938
24.2%
3 12087
 
8.6%
0 9245
 
6.6%
5 8118
 
5.8%
4 8099
 
5.8%
7 8096
 
5.8%
8 7960
 
5.7%
9 7762
 
5.5%
6 7660
 
5.5%

Most occurring scripts

ValueCountFrequency (%)
Common 140491
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 37526
26.7%
2 33938
24.2%
3 12087
 
8.6%
0 9245
 
6.6%
5 8118
 
5.8%
4 8099
 
5.8%
7 8096
 
5.8%
8 7960
 
5.7%
9 7762
 
5.5%
6 7660
 
5.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 140491
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 37526
26.7%
2 33938
24.2%
3 12087
 
8.6%
0 9245
 
6.6%
5 8118
 
5.8%
4 8099
 
5.8%
7 8096
 
5.8%
8 7960
 
5.7%
9 7762
 
5.5%
6 7660
 
5.5%

habitat
Text

Missing 

Distinct: 14351
Distinct (%): 49.8%
Missing: 157729
Missing (%): 84.6%
Memory size: 1.4 MiB

Length

Max length: 242
Median length: 188
Mean length: 21.31243056
Min length: 3

Characters and Unicode

Total characters: 613798
Distinct characters: 96
Distinct categories: 13
Distinct scripts: 2
Distinct blocks: 4

Unique

Unique: 11617
Unique (%): 40.3%

Sample

1st row: Earth
2nd row: Edge of lake; Moist soil
3rd row: On high cliff
4th row: Primary montane forest.
5th row: Sur les arbres (on the trees)
ValueCountFrequency (%)
on 15468
 
13.4%
in 6301
 
5.5%
of 4480
 
3.9%
rocks 4154
 
3.6%
a 1945
 
1.7%
woods 1920
 
1.7%
wet 1736
 
1.5%
trees 1636
 
1.4%
and 1457
 
1.3%
tree 1397
 
1.2%
Other values (4244) 74614
64.8%

Most occurring characters

ValueCountFrequency (%)
86308
14.1%
e 49629
 
8.1%
o 47740
 
7.8%
n 47247
 
7.7%
a 38799
 
6.3%
s 38696
 
6.3%
r 36877
 
6.0%
t 27255
 
4.4%
i 24024
 
3.9%
d 23439
 
3.8%
Other values (86) 193784
31.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 476716
77.7%
Space Separator 86308
 
14.1%
Uppercase Letter 35456
 
5.8%
Other Punctuation 12410
 
2.0%
Dash Punctuation 868
 
0.1%
Close Punctuation 824
 
0.1%
Open Punctuation 816
 
0.1%
Decimal Number 335
 
0.1%
Math Symbol 43
 
< 0.1%
Currency Symbol 11
 
< 0.1%
Other values (3) 11
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 49629
10.4%
o 47740
 
10.0%
n 47247
 
9.9%
a 38799
 
8.1%
s 38696
 
8.1%
r 36877
 
7.7%
t 27255
 
5.7%
i 24024
 
5.0%
d 23439
 
4.9%
l 20507
 
4.3%
Other values (17) 122503
25.7%
Uppercase Letter
ValueCountFrequency (%)
O 14615
41.2%
I 2748
 
7.8%
S 2363
 
6.7%
B 2146
 
6.1%
R 1640
 
4.6%
A 1512
 
4.3%
C 1306
 
3.7%
W 1225
 
3.5%
M 1196
 
3.4%
D 910
 
2.6%
Other values (17) 5795
 
16.3%
Other Punctuation
ValueCountFrequency (%)
. 5273
42.5%
, 3827
30.8%
; 2610
21.0%
/ 316
 
2.5%
& 127
 
1.0%
? 91
 
0.7%
" 73
 
0.6%
' 54
 
0.4%
: 34
 
0.3%
¡ 3
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 67
20.0%
2 55
16.4%
0 55
16.4%
3 52
15.5%
4 32
9.6%
6 20
 
6.0%
5 20
 
6.0%
9 15
 
4.5%
8 13
 
3.9%
7 6
 
1.8%
Open Punctuation
ValueCountFrequency (%)
( 796
97.5%
10
 
1.2%
[ 9
 
1.1%
{ 1
 
0.1%
Math Symbol
ValueCountFrequency (%)
+ 38
88.4%
= 4
 
9.3%
< 1
 
2.3%
Currency Symbol
ValueCountFrequency (%)
¤ 6
54.5%
¢ 3
27.3%
£ 2
 
18.2%
Initial Punctuation
ValueCountFrequency (%)
3
50.0%
2
33.3%
1
 
16.7%
Dash Punctuation
ValueCountFrequency (%)
- 866
99.8%
2
 
0.2%
Close Punctuation
ValueCountFrequency (%)
) 814
98.8%
] 10
 
1.2%
Final Punctuation
ValueCountFrequency (%)
3
75.0%
1
 
25.0%
Space Separator
ValueCountFrequency (%)
86308
100.0%
Modifier Letter
ValueCountFrequency (%)
ˆ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 512172
83.4%
Common 101626
 
16.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 49629
 
9.7%
o 47740
 
9.3%
n 47247
 
9.2%
a 38799
 
7.6%
s 38696
 
7.6%
r 36877
 
7.2%
t 27255
 
5.3%
i 24024
 
4.7%
d 23439
 
4.6%
l 20507
 
4.0%
Other values (44) 157959
30.8%
Common
ValueCountFrequency (%)
86308
84.9%
. 5273
 
5.2%
, 3827
 
3.8%
; 2610
 
2.6%
- 866
 
0.9%
) 814
 
0.8%
( 796
 
0.8%
/ 316
 
0.3%
& 127
 
0.1%
? 91
 
0.1%
Other values (32) 598
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 613755
> 99.9%
Punctuation 22
 
< 0.1%
None 20
 
< 0.1%
Modifier Letters 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
86308
14.1%
e 49629
 
8.1%
o 47740
 
7.8%
n 47247
 
7.7%
a 38799
 
6.3%
s 38696
 
6.3%
r 36877
 
6.0%
t 27255
 
4.4%
i 24024
 
3.9%
d 23439
 
3.8%
Other values (72) 193741
31.6%
Punctuation
ValueCountFrequency (%)
10
45.5%
3
 
13.6%
3
 
13.6%
2
 
9.1%
2
 
9.1%
1
 
4.5%
1
 
4.5%
None
ValueCountFrequency (%)
¤ 6
30.0%
Š 3
15.0%
¡ 3
15.0%
ø 3
15.0%
¢ 3
15.0%
£ 2
 
10.0%
Modifier Letters
ValueCountFrequency (%)
ˆ 1
100.0%

higherGeography
Text

Missing 

Distinct: 3946
Distinct (%): 3.4%
Missing: 72099
Missing (%): 38.7%
Memory size: 1.4 MiB

Length

Max length: 100
Median length: 90
Mean length: 51.15013545
Min length: 4

Characters and Unicode

Total characters: 5853110
Distinct characters: 68
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 1598
Unique (%): 1.4%

Sample

1st row: North America; USA; Connecticut; New London County; Salem
2nd row: North America; USA; Connecticut; New Haven County; New Haven
3rd row: North America; Canada; British Columbia
4th row: North America; USA; Connecticut; Litchfield County; Washington
5th row: North America; USA; Connecticut; Hartford County; Southington
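
The sample values read as a semicolon-separated path from the broadest region to the most specific place, with varying depth (three levels for the British Columbia row, five for the Connecticut rows). A minimal sketch of splitting such a value into its levels is shown below. It assumes `;` never occurs inside a place name, which the samples suggest but do not prove; `split_higher_geography` is a hypothetical name.

```python
def split_higher_geography(value):
    """Split a higherGeography string into its levels, broadest first."""
    return [part.strip() for part in value.split(";") if part.strip()]


levels = split_higher_geography(
    "North America; USA; Connecticut; New London County; Salem"
)
print(levels)       # list of place names, broadest first
print(len(levels))  # depth of the hierarchy for this record
```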
ValueCountFrequency (%)
north 111995
14.5%
america 109184
14.1%
usa 99054
12.8%
county 86823
11.2%
connecticut 62098
 
8.0%
new 41413
 
5.3%
haven 29950
 
3.9%
hartford 12411
 
1.6%
litchfield 10261
 
1.3%
fairfield 7167
 
0.9%
Other values (2947) 204171
26.4%

Most occurring characters

ValueCountFrequency (%)
660097
 
11.3%
t 419182
 
7.2%
o 390984
 
6.7%
; 389920
 
6.7%
n 368644
 
6.3%
e 355954
 
6.1%
r 341889
 
5.8%
a 317910
 
5.4%
i 314775
 
5.4%
c 274503
 
4.7%
Other values (58) 2019252
34.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3828543
65.4%
Uppercase Letter 972588
 
16.6%
Space Separator 660097
 
11.3%
Other Punctuation 390992
 
6.7%
Dash Punctuation 882
 
< 0.1%
Open Punctuation 4
 
< 0.1%
Close Punctuation 4
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 419182
10.9%
o 390984
10.2%
n 368644
9.6%
e 355954
9.3%
r 341889
8.9%
a 317910
8.3%
i 314775
8.2%
c 274503
 
7.2%
u 191407
 
5.0%
h 158690
 
4.1%
Other values (22) 694605
18.1%
Uppercase Letter
ValueCountFrequency (%)
A 217147
22.3%
C 174483
17.9%
N 158609
16.3%
S 122261
12.6%
U 100685
10.4%
H 51363
 
5.3%
M 24299
 
2.5%
L 22479
 
2.3%
W 14997
 
1.5%
F 14090
 
1.4%
Other values (16) 72175
 
7.4%
Other Punctuation
ValueCountFrequency (%)
; 389920
99.7%
' 560
 
0.1%
. 316
 
0.1%
& 196
 
0.1%
Open Punctuation
ValueCountFrequency (%)
[ 2
50.0%
( 2
50.0%
Close Punctuation
ValueCountFrequency (%)
] 2
50.0%
) 2
50.0%
Space Separator
ValueCountFrequency (%)
660097
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 882
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4801131
82.0%
Common 1051979
 
18.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 419182
 
8.7%
o 390984
 
8.1%
n 368644
 
7.7%
e 355954
 
7.4%
r 341889
 
7.1%
a 317910
 
6.6%
i 314775
 
6.6%
c 274503
 
5.7%
A 217147
 
4.5%
u 191407
 
4.0%
Other values (48) 1608736
33.5%
Common
ValueCountFrequency (%)
660097
62.7%
; 389920
37.1%
- 882
 
0.1%
' 560
 
0.1%
. 316
 
< 0.1%
& 196
 
< 0.1%
[ 2
 
< 0.1%
] 2
 
< 0.1%
( 2
 
< 0.1%
) 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5852731
> 99.9%
None 379
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
660097
 
11.3%
t 419182
 
7.2%
o 390984
 
6.7%
; 389920
 
6.7%
n 368644
 
6.3%
e 355954
 
6.1%
r 341889
 
5.8%
a 317910
 
5.4%
i 314775
 
5.4%
c 274503
 
4.7%
Other values (52) 2018873
34.5%
None
ValueCountFrequency (%)
á 110
29.0%
í 98
25.9%
ü 97
25.6%
é 36
 
9.5%
ó 36
 
9.5%
ç 2
 
0.5%
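
The sample rows for higherGeography suggest that each value is a semicolon-separated list of place names, ordered from broadest to most specific. A minimal sketch of splitting such a value into its levels follows. The function name split_higher_geography is hypothetical, and the broadest-first ordering is an assumption based only on the five sample rows shown.

```python
def split_higher_geography(value):
    """Split a semicolon-delimited higherGeography string into a list of levels.

    Surrounding whitespace is trimmed and empty segments are dropped.
    """
    return [part.strip() for part in value.split(";") if part.strip()]

levels = split_higher_geography(
    "North America; USA; Connecticut; New London County; Salem"
)
print(levels)
```

The number of levels varies between records (the third sample row has only three), so downstream code should not assume a fixed depth.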

continent
Text

Missing 

Distinct 7
Distinct (%) < 0.1%
Missing 73143
Missing (%) 39.2%
Memory size 1.4 MiB

Length

Max length 13
Median length 13
Mean length 12.76341876
Min length 4

Characters and Unicode

Total characters 1447193
Distinct characters 15
Distinct categories 2
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row NORTH_AMERICA
2nd row NORTH_AMERICA
3rd row NORTH_AMERICA
4th row NORTH_AMERICA
5th row NORTH_AMERICA
ValueCountFrequency (%)
north_america 108995
96.1%
europe 1671
 
1.5%
asia 1008
 
0.9%
south_america 749
 
0.7%
oceania 665
 
0.6%
africa 293
 
0.3%
antarctica 5
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
A 223435
15.4%
R 220708
15.3%
E 113751
7.9%
O 112080
7.7%
I 111715
7.7%
C 110712
7.7%
T 109754
7.6%
H 109744
7.6%
_ 109744
7.6%
M 109744
7.6%
Other values (5) 115806
8.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1337449
92.4%
Connector Punctuation 109744
 
7.6%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 223435
16.7%
R 220708
16.5%
E 113751
8.5%
O 112080
8.4%
I 111715
8.4%
C 110712
8.3%
T 109754
8.2%
H 109744
8.2%
M 109744
8.2%
N 109665
8.2%
Other values (4) 6141
 
0.5%
Connector Punctuation
ValueCountFrequency (%)
_ 109744
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1337449
92.4%
Common 109744
 
7.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 223435
16.7%
R 220708
16.5%
E 113751
8.5%
O 112080
8.4%
I 111715
8.4%
C 110712
8.3%
T 109754
8.2%
H 109744
8.2%
M 109744
8.2%
N 109665
8.2%
Other values (4) 6141
 
0.5%
Common
ValueCountFrequency (%)
_ 109744
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1447193
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 223435
15.4%
R 220708
15.3%
E 113751
7.9%
O 112080
7.7%
I 111715
7.7%
C 110712
7.7%
T 109754
7.6%
H 109744
7.6%
_ 109744
7.6%
M 109744
7.6%
Other values (5) 115806
8.0%

waterBody
Text

Missing 

Distinct 18
Distinct (%) 0.6%
Missing 183495
Missing (%) 98.4%
Memory size 1.4 MiB

Length

Max length 33
Median length 32
Mean length 22.4996704
Min length 12

Characters and Unicode

Total characters 68264
Distinct characters 36
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 6
Unique (%) 0.2%

Sample

1st row Atlantic Ocean; Caribbean Sea
2nd row Atlantic Ocean; Sargasso Sea
3rd row Atlantic Ocean
4th row Atlantic Ocean; Caribbean Sea
5th row Atlantic Ocean; Adriatic Sea
ValueCountFrequency (%)
ocean 3034
30.6%
atlantic 2509
25.3%
sea 1009
 
10.2%
caribbean 673
 
6.8%
long 503
 
5.1%
island 503
 
5.1%
sound 503
 
5.1%
pacific 450
 
4.5%
adriatic 126
 
1.3%
sargasso 123
 
1.2%
Other values (15) 478
 
4.8%

Most occurring characters

ValueCountFrequency (%)
a 9455
13.9%
n 8029
11.8%
6877
10.1%
c 6672
9.8%
t 5226
 
7.7%
e 5055
 
7.4%
i 4589
 
6.7%
l 3120
 
4.6%
O 3036
 
4.4%
A 2636
 
3.9%
Other values (26) 13569
19.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 49965
73.2%
Uppercase Letter 9805
 
14.4%
Space Separator 6877
 
10.1%
Other Punctuation 1617
 
2.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 9455
18.9%
n 8029
16.1%
c 6672
13.4%
t 5226
10.5%
e 5055
10.1%
i 4589
9.2%
l 3120
 
6.2%
b 1347
 
2.7%
o 1339
 
2.7%
d 1289
 
2.6%
Other values (11) 3844
7.7%
Uppercase Letter
ValueCountFrequency (%)
O 3036
31.0%
A 2636
26.9%
S 1637
16.7%
C 675
 
6.9%
I 577
 
5.9%
L 503
 
5.1%
P 451
 
4.6%
M 176
 
1.8%
G 105
 
1.1%
R 6
 
0.1%
Other values (3) 3
 
< 0.1%
Space Separator
ValueCountFrequency (%)
6877
100.0%
Other Punctuation
ValueCountFrequency (%)
; 1617
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 59770
87.6%
Common 8494
 
12.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 9455
15.8%
n 8029
13.4%
c 6672
11.2%
t 5226
8.7%
e 5055
8.5%
i 4589
7.7%
l 3120
 
5.2%
O 3036
 
5.1%
A 2636
 
4.4%
S 1637
 
2.7%
Other values (24) 10315
17.3%
Common
ValueCountFrequency (%)
6877
81.0%
; 1617
 
19.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 68264
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 9455
13.9%
n 8029
11.8%
6877
10.1%
c 6672
9.8%
t 5226
 
7.7%
e 5055
 
7.4%
i 4589
 
6.7%
l 3120
 
4.6%
O 3036
 
4.4%
A 2636
 
3.9%
Other values (26) 13569
19.9%

countryCode
Text

Missing 

Distinct 107
Distinct (%) 0.1%
Missing 72482
Missing (%) 38.9%
Memory size 1.4 MiB

Length

Max length 2
Median length 2
Mean length 2
Min length 2

Characters and Unicode

Total characters 228094
Distinct characters 26
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 16
Unique (%) < 0.1%

Sample

1st row US
2nd row US
3rd row CA
4th row US
5th row US
ValueCountFrequency (%)
us 98166
86.1%
ca 6370
 
5.6%
mx 1398
 
1.2%
cu 1385
 
1.2%
pr 883
 
0.8%
cn 726
 
0.6%
gb 643
 
0.6%
au 497
 
0.4%
bm 438
 
0.4%
fr 405
 
0.4%
Other values (97) 3136
 
2.7%

Most occurring characters

ValueCountFrequency (%)
U 100202
43.9%
S 98426
43.2%
C 8932
 
3.9%
A 7095
 
3.1%
M 2280
 
1.0%
R 1558
 
0.7%
B 1489
 
0.7%
X 1404
 
0.6%
P 1321
 
0.6%
N 982
 
0.4%
Other values (16) 4405
 
1.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 228094
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
U 100202
43.9%
S 98426
43.2%
C 8932
 
3.9%
A 7095
 
3.1%
M 2280
 
1.0%
R 1558
 
0.7%
B 1489
 
0.7%
X 1404
 
0.6%
P 1321
 
0.6%
N 982
 
0.4%
Other values (16) 4405
 
1.9%

Most occurring scripts

ValueCountFrequency (%)
Latin 228094
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
U 100202
43.9%
S 98426
43.2%
C 8932
 
3.9%
A 7095
 
3.1%
M 2280
 
1.0%
R 1558
 
0.7%
B 1489
 
0.7%
X 1404
 
0.6%
P 1321
 
0.6%
N 982
 
0.4%
Other values (16) 4405
 
1.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 228094
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
U 100202
43.9%
S 98426
43.2%
C 8932
 
3.9%
A 7095
 
3.1%
M 2280
 
1.0%
R 1558
 
0.7%
B 1489
 
0.7%
X 1404
 
0.6%
P 1321
 
0.6%
N 982
 
0.4%
Other values (16) 4405
 
1.9%

stateProvince
Text

Missing 

Distinct 228
Distinct (%) 0.2%
Missing 78016
Missing (%) 41.8%
Memory size 1.4 MiB

Length

Max length 28
Median length 11
Mean length 10.38558514
Min length 4

Characters and Unicode

Total characters 1126971
Distinct characters 58
Distinct categories 4
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 48
Unique (%) < 0.1%

Sample

1st row Connecticut
2nd row Connecticut
3rd row British Columbia
4th row Connecticut
5th row Connecticut
ValueCountFrequency (%)
connecticut 62098
50.1%
new 5448
 
4.4%
california 3651
 
2.9%
michigan 3126
 
2.5%
florida 2732
 
2.2%
hampshire 2664
 
2.1%
massachusetts 2337
 
1.9%
maine 2073
 
1.7%
columbia 2034
 
1.6%
british 1905
 
1.5%
Other values (255) 35967
29.0%

Most occurring characters

ValueCountFrequency (%)
n 153675
13.6%
t 143885
12.8%
c 135676
12.0%
i 108030
9.6%
o 93850
8.3%
e 90883
8.1%
u 72414
6.4%
C 69792
 
6.2%
a 54867
 
4.9%
r 26005
 
2.3%
Other values (48) 177894
15.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 987704
87.6%
Uppercase Letter 123543
 
11.0%
Space Separator 15522
 
1.4%
Dash Punctuation 202
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 153675
15.6%
t 143885
14.6%
c 135676
13.7%
i 108030
10.9%
o 93850
9.5%
e 90883
9.2%
u 72414
7.3%
a 54867
 
5.6%
r 26005
 
2.6%
s 23900
 
2.4%
Other values (21) 84519
8.6%
Uppercase Letter
ValueCountFrequency (%)
C 69792
56.5%
M 9865
 
8.0%
N 8790
 
7.1%
H 4662
 
3.8%
S 3193
 
2.6%
F 2734
 
2.2%
W 2621
 
2.1%
V 2372
 
1.9%
B 2286
 
1.9%
P 2093
 
1.7%
Other values (15) 15135
 
12.3%
Space Separator
ValueCountFrequency (%)
15522
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 202
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1111247
98.6%
Common 15724
 
1.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 153675
13.8%
t 143885
12.9%
c 135676
12.2%
i 108030
9.7%
o 93850
8.4%
e 90883
8.2%
u 72414
6.5%
C 69792
6.3%
a 54867
 
4.9%
r 26005
 
2.3%
Other values (46) 162170
14.6%
Common
ValueCountFrequency (%)
15522
98.7%
- 202
 
1.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1126602
> 99.9%
None 369
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 153675
13.6%
t 143885
12.8%
c 135676
12.0%
i 108030
9.6%
o 93850
8.3%
e 90883
8.1%
u 72414
6.4%
C 69792
 
6.2%
a 54867
 
4.9%
r 26005
 
2.3%
Other values (43) 177525
15.8%
None
ValueCountFrequency (%)
á 107
29.0%
í 98
26.6%
ü 97
26.3%
ó 34
 
9.2%
é 33
 
8.9%

county
Text

Missing 

Distinct 881
Distinct (%) 1.0%
Missing 98586
Missing (%) 52.9%
Memory size 1.4 MiB

Length

Max length 32
Median length 31
Mean length 15.47382964
Min length 4

Characters and Unicode

Total characters 1360815
Distinct characters 59
Distinct categories 7
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 234
Unique (%) 0.3%

Sample

1st row New London County
2nd row New Haven County
3rd row Litchfield County
4th row Hartford County
5th row Litchfield County
ValueCountFrequency (%)
county 86823
42.1%
new 27711
 
13.4%
haven 21492
 
10.4%
hartford 10602
 
5.1%
litchfield 8892
 
4.3%
fairfield 6414
 
3.1%
london 6205
 
3.0%
middlesex 4458
 
2.2%
windham 2098
 
1.0%
tolland 1927
 
0.9%
Other values (919) 29493
 
14.3%

Most occurring characters

ValueCountFrequency (%)
n 142574
 
10.5%
o 128105
 
9.4%
118172
 
8.7%
t 116308
 
8.5%
u 93310
 
6.9%
C 90545
 
6.7%
e 90254
 
6.6%
y 88473
 
6.5%
a 63460
 
4.7%
d 48213
 
3.5%
Other values (49) 381401
28.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1035147
76.1%
Uppercase Letter 206666
 
15.2%
Space Separator 118172
 
8.7%
Dash Punctuation 444
 
< 0.1%
Other Punctuation 384
 
< 0.1%
Open Punctuation 1
 
< 0.1%
Close Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 142574
13.8%
o 128105
12.4%
t 116308
11.2%
u 93310
9.0%
e 90254
8.7%
y 88473
8.5%
a 63460
 
6.1%
d 48213
 
4.7%
i 47877
 
4.6%
r 39508
 
3.8%
Other values (17) 177065
17.1%
Uppercase Letter
ValueCountFrequency (%)
C 90545
43.8%
H 33277
 
16.1%
N 28182
 
13.6%
L 16295
 
7.9%
M 7866
 
3.8%
F 7523
 
3.6%
S 4016
 
1.9%
W 3238
 
1.6%
T 2375
 
1.1%
B 2345
 
1.1%
Other values (16) 11004
 
5.3%
Other Punctuation
ValueCountFrequency (%)
. 223
58.1%
' 161
41.9%
Space Separator
ValueCountFrequency (%)
118172
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 444
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1241813
91.3%
Common 119002
 
8.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 142574
11.5%
o 128105
 
10.3%
t 116308
 
9.4%
u 93310
 
7.5%
C 90545
 
7.3%
e 90254
 
7.3%
y 88473
 
7.1%
a 63460
 
5.1%
d 48213
 
3.9%
i 47877
 
3.9%
Other values (43) 332694
26.8%
Common
ValueCountFrequency (%)
118172
99.3%
- 444
 
0.4%
. 223
 
0.2%
' 161
 
0.1%
( 1
 
< 0.1%
) 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1360813
> 99.9%
None 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 142574
 
10.5%
o 128105
 
9.4%
118172
 
8.7%
t 116308
 
8.5%
u 93310
 
6.9%
C 90545
 
6.7%
e 90254
 
6.6%
y 88473
 
6.5%
a 63460
 
4.7%
d 48213
 
3.5%
Other values (48) 381399
28.0%
None
ValueCountFrequency (%)
ó 2
100.0%

municipality
Text

Missing 

Distinct 2118
Distinct (%) 2.8%
Missing 110052
Missing (%) 59.0%
Memory size 1.4 MiB

Length

Max length 39
Median length 30
Mean length 8.966486656
Min length 3

Characters and Unicode

Total characters 685730
Distinct characters 62
Distinct categories 7
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 841
Unique (%) 1.1%

Sample

1st row Salem
2nd row New Haven
3rd row Washington
4th row Southington
5th row Cornwall
ValueCountFrequency (%)
haven 8458
 
8.7%
new 8105
 
8.3%
southington 2857
 
2.9%
north 2691
 
2.8%
east 2584
 
2.7%
guilford 2301
 
2.4%
salisbury 1956
 
2.0%
lyme 1877
 
1.9%
branford 1824
 
1.9%
hartford 1809
 
1.9%
Other values (2013) 62977
64.6%

Most occurring characters

ValueCountFrequency (%)
o 53847
 
7.9%
e 53791
 
7.8%
n 53538
 
7.8%
r 52706
 
7.7%
a 51213
 
7.5%
t 42856
 
6.2%
i 37612
 
5.5%
l 34531
 
5.0%
d 28780
 
4.2%
20962
 
3.1%
Other values (52) 255894
37.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 566373
82.6%
Uppercase Letter 97469
 
14.2%
Space Separator 20962
 
3.1%
Other Punctuation 688
 
0.1%
Dash Punctuation 236
 
< 0.1%
Open Punctuation 1
 
< 0.1%
Close Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 53847
 
9.5%
e 53791
 
9.5%
n 53538
 
9.5%
r 52706
 
9.3%
a 51213
 
9.0%
t 42856
 
7.6%
i 37612
 
6.6%
l 34531
 
6.1%
d 28780
 
5.1%
s 19429
 
3.4%
Other values (19) 138070
24.4%
Uppercase Letter
ValueCountFrequency (%)
S 13453
13.8%
H 13413
13.8%
N 12984
13.3%
W 9137
9.4%
B 6850
 
7.0%
G 5994
 
6.1%
C 4838
 
5.0%
M 4772
 
4.9%
L 4537
 
4.7%
E 3457
 
3.5%
Other values (16) 18034
18.5%
Other Punctuation
ValueCountFrequency (%)
' 399
58.0%
& 196
28.5%
. 93
 
13.5%
Space Separator
ValueCountFrequency (%)
20962
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 236
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 663842
96.8%
Common 21888
 
3.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 53847
 
8.1%
e 53791
 
8.1%
n 53538
 
8.1%
r 52706
 
7.9%
a 51213
 
7.7%
t 42856
 
6.5%
i 37612
 
5.7%
l 34531
 
5.2%
d 28780
 
4.3%
s 19429
 
2.9%
Other values (45) 235539
35.5%
Common
ValueCountFrequency (%)
20962
95.8%
' 399
 
1.8%
- 236
 
1.1%
& 196
 
0.9%
. 93
 
0.4%
( 1
 
< 0.1%
) 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 685722
> 99.9%
None 8
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 53847
 
7.9%
e 53791
 
7.8%
n 53538
 
7.8%
r 52706
 
7.7%
a 51213
 
7.5%
t 42856
 
6.2%
i 37612
 
5.5%
l 34531
 
5.0%
d 28780
 
4.2%
20962
 
3.1%
Other values (49) 255886
37.3%
None
ValueCountFrequency (%)
á 3
37.5%
é 3
37.5%
ç 2
25.0%

locality
Text

Missing 

Distinct 21413
Distinct (%) 35.0%
Missing 125307
Missing (%) 67.2%
Memory size 1.4 MiB

Length

Max length 351
Median length 190
Mean length 26.85044592
Min length 3

Characters and Unicode

Total characters 1643838
Distinct characters 94
Distinct categories 10
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 14163
Unique (%) 23.1%

Sample

1st row near Scotch Creek, Shushwap Lake
2nd row South Shuttle Street
3rd row Calumet Island, Timbalier Bay
4th row Rio Blanco
5th row Oak Hill
ValueCountFrequency (%)
of 13616
 
5.1%
near 6444
 
2.4%
island 5988
 
2.3%
river 4368
 
1.6%
lake 4069
 
1.5%
and 3691
 
1.4%
road 3087
 
1.2%
yale 3008
 
1.1%
mountains 2923
 
1.1%
west 2905
 
1.1%
Other values (11704) 214841
81.1%

Most occurring characters

ValueCountFrequency (%)
203718
 
12.4%
a 141357
 
8.6%
e 131463
 
8.0%
o 116914
 
7.1%
n 109604
 
6.7%
r 89713
 
5.5%
t 83094
 
5.1%
i 75103
 
4.6%
l 66840
 
4.1%
s 66454
 
4.0%
Other values (84) 559578
34.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1197147
72.8%
Space Separator 203718
 
12.4%
Uppercase Letter 186670
 
11.4%
Other Punctuation 33882
 
2.1%
Decimal Number 13304
 
0.8%
Close Punctuation 3613
 
0.2%
Open Punctuation 3582
 
0.2%
Dash Punctuation 1450
 
0.1%
Other Symbol 423
 
< 0.1%
Math Symbol 49
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 141357
11.8%
e 131463
11.0%
o 116914
9.8%
n 109604
9.2%
r 89713
 
7.5%
t 83094
 
6.9%
i 75103
 
6.3%
l 66840
 
5.6%
s 66454
 
5.6%
d 50253
 
4.2%
Other values (24) 266352
22.2%
Uppercase Letter
ValueCountFrequency (%)
M 17688
 
9.5%
S 17239
 
9.2%
R 16853
 
9.0%
C 14862
 
8.0%
P 14723
 
7.9%
B 14281
 
7.7%
L 12462
 
6.7%
H 8806
 
4.7%
N 8375
 
4.5%
I 8098
 
4.3%
Other values (17) 53283
28.5%
Other Punctuation
ValueCountFrequency (%)
, 22303
65.8%
. 6163
 
18.2%
' 3859
 
11.4%
/ 345
 
1.0%
" 313
 
0.9%
; 267
 
0.8%
: 241
 
0.7%
? 164
 
0.5%
& 134
 
0.4%
# 92
 
0.3%
Decimal Number
ValueCountFrequency (%)
1 2497
18.8%
2 1786
13.4%
0 1562
11.7%
3 1451
10.9%
5 1340
10.1%
4 1323
9.9%
7 900
 
6.8%
9 872
 
6.6%
6 818
 
6.1%
8 755
 
5.7%
Close Punctuation
ValueCountFrequency (%)
] 3215
89.0%
) 397
 
11.0%
} 1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
[ 3212
89.7%
( 369
 
10.3%
{ 1
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
= 42
85.7%
+ 6
 
12.2%
> 1
 
2.0%
Space Separator
ValueCountFrequency (%)
203718
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1450
100.0%
Other Symbol
ValueCountFrequency (%)
° 423
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1383817
84.2%
Common 260021
 
15.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 141357
 
10.2%
e 131463
 
9.5%
o 116914
 
8.4%
n 109604
 
7.9%
r 89713
 
6.5%
t 83094
 
6.0%
i 75103
 
5.4%
l 66840
 
4.8%
s 66454
 
4.8%
d 50253
 
3.6%
Other values (51) 453022
32.7%
Common
ValueCountFrequency (%)
203718
78.3%
, 22303
 
8.6%
. 6163
 
2.4%
' 3859
 
1.5%
] 3215
 
1.2%
[ 3212
 
1.2%
1 2497
 
1.0%
2 1786
 
0.7%
0 1562
 
0.6%
3 1451
 
0.6%
Other values (23) 10255
 
3.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1643370
> 99.9%
None 468
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
203718
 
12.4%
a 141357
 
8.6%
e 131463
 
8.0%
o 116914
 
7.1%
n 109604
 
6.7%
r 89713
 
5.5%
t 83094
 
5.1%
i 75103
 
4.6%
l 66840
 
4.1%
s 66454
 
4.0%
Other values (74) 559110
34.0%
None
ValueCountFrequency (%)
° 423
90.4%
é 14
 
3.0%
á 9
 
1.9%
í 6
 
1.3%
à 6
 
1.3%
Î 4
 
0.9%
ú 2
 
0.4%
ñ 2
 
0.4%
ã 1
 
0.2%
ä 1
 
0.2%

verbatimElevation
Text

Missing 

Distinct 884
Distinct (%) 11.6%
Missing 178933
Missing (%) 95.9%
Memory size 1.4 MiB

Length

Max length 14
Median length 13
Mean length 5.866508689
Min length 3

Characters and Unicode

Total characters 44562
Distinct characters 15
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 369
Unique (%) 4.9%

Sample

1st row 564 m
2nd row 1450-1550 m
3rd row 1012 m
4th row 137 m
5th row 1463 m
ValueCountFrequency (%)
m 7482
49.2%
1524 267
 
1.8%
305 236
 
1.6%
1219 190
 
1.3%
1829 179
 
1.2%
366 170
 
1.1%
914 167
 
1.1%
610 162
 
1.1%
2743 153
 
1.0%
244 150
 
1.0%
Other values (875) 6036
39.7%

Most occurring characters

ValueCountFrequency (%)
7596
17.0%
m 7482
16.8%
0 5082
11.4%
1 4893
11.0%
2 3978
8.9%
3 2614
 
5.9%
5 2551
 
5.7%
4 2434
 
5.5%
6 1983
 
4.4%
8 1718
 
3.9%
Other values (5) 4231
9.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 28520
64.0%
Lowercase Letter 7710
 
17.3%
Space Separator 7596
 
17.0%
Dash Punctuation 736
 
1.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 5082
17.8%
1 4893
17.2%
2 3978
13.9%
3 2614
9.2%
5 2551
8.9%
4 2434
8.5%
6 1983
 
7.0%
8 1718
 
6.0%
7 1718
 
6.0%
9 1549
 
5.4%
Lowercase Letter
ValueCountFrequency (%)
m 7482
97.0%
f 114
 
1.5%
t 114
 
1.5%
Space Separator
ValueCountFrequency (%)
7596
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 736
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 36852
82.7%
Latin 7710
 
17.3%

Most frequent character per script

Common
ValueCountFrequency (%)
7596
20.6%
0 5082
13.8%
1 4893
13.3%
2 3978
10.8%
3 2614
 
7.1%
5 2551
 
6.9%
4 2434
 
6.6%
6 1983
 
5.4%
8 1718
 
4.7%
7 1718
 
4.7%
Other values (2) 2285
 
6.2%
Latin
ValueCountFrequency (%)
m 7482
97.0%
f 114
 
1.5%
t 114
 
1.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 44562
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
7596
17.0%
m 7482
16.8%
0 5082
11.4%
1 4893
11.0%
2 3978
8.9%
3 2614
 
5.9%
5 2551
 
5.7%
4 2434
 
5.5%
6 1983
 
4.4%
8 1718
 
3.9%
Other values (5) 4231
9.5%
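
The verbatimElevation samples show single values ("564 m") and ranges ("1450-1550 m"), and the character tables list only digits, a space, a hyphen, and the letters m, f, and t. A hedged sketch of normalising such strings to a numeric range in meters follows. It assumes that the letters f and t occur only as the unit "ft", which the report does not confirm, and the function name parse_elevation is hypothetical.

```python
import re

FEET_TO_METERS = 0.3048  # exact, by the international definition of the foot

# A number, an optional "-number" upper bound, then a unit of m or ft (assumed).
_ELEVATION = re.compile(r"^\s*(\d+)\s*(?:-\s*(\d+))?\s*(m|ft)\s*$")

def parse_elevation(text):
    """Return (min_m, max_m) in meters, or None if the text does not match."""
    match = _ELEVATION.match(text)
    if match is None:
        return None
    low = float(match.group(1))
    # A single value is treated as a range whose bounds coincide.
    high = float(match.group(2)) if match.group(2) else low
    factor = FEET_TO_METERS if match.group(3) == "ft" else 1.0
    return (low * factor, high * factor)

print(parse_elevation("564 m"))
print(parse_elevation("1450-1550 m"))
```

Returning None for unmatched text keeps unexpected formats visible instead of silently coercing them.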

decimalLatitude
Text

Missing 

Distinct 8319
Distinct (%) 8.0%
Missing 82100
Missing (%) 44.0%
Memory size 1.4 MiB

Length

Max length 10
Median length 9
Mean length 7.598693849
Min length 3

Characters and Unicode

Total characters 793524
Distinct characters 12
Distinct categories 3
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 4029
Unique (%) 3.9%

Sample

1st row 41.4854
2nd row 41.407
3rd row 51.0
4th row 41.6523
5th row 41.605
ValueCountFrequency (%)
41.407 2004
 
1.9%
41.305111 1951
 
1.9%
41.3114 1870
 
1.8%
41.605 1661
 
1.6%
41.5583 1312
 
1.3%
41.6049 1164
 
1.1%
41.986 1069
 
1.0%
46.166667 1017
 
1.0%
41.6153 994
 
1.0%
41.7413 947
 
0.9%
Other values (8306) 90440
86.6%

Most occurring characters

ValueCountFrequency (%)
4 139283
17.6%
1 119799
15.1%
. 104429
13.2%
3 70791
8.9%
6 63059
7.9%
9 54886
 
6.9%
7 53117
 
6.7%
5 52064
 
6.6%
2 50360
 
6.3%
8 44495
 
5.6%
Other values (2) 41241
 
5.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 688290
86.7%
Other Punctuation 104429
 
13.2%
Dash Punctuation 805
 
0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 139283
20.2%
1 119799
17.4%
3 70791
10.3%
6 63059
9.2%
9 54886
 
8.0%
7 53117
 
7.7%
5 52064
 
7.6%
2 50360
 
7.3%
8 44495
 
6.5%
0 40436
 
5.9%
Other Punctuation
ValueCountFrequency (%)
. 104429
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 805
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 793524
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
4 139283
17.6%
1 119799
15.1%
. 104429
13.2%
3 70791
8.9%
6 63059
7.9%
9 54886
 
6.9%
7 53117
 
6.7%
5 52064
 
6.6%
2 50360
 
6.3%
8 44495
 
5.6%
Other values (2) 41241
 
5.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 793524
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4 139283
17.6%
1 119799
15.1%
. 104429
13.2%
3 70791
8.9%
6 63059
7.9%
9 54886
 
6.9%
7 53117
 
6.7%
5 52064
 
6.6%
2 50360
 
6.3%
8 44495
 
5.6%
Other values (2) 41241
 
5.2%

decimalLongitude
Text

Missing 

Distinct 8315
Distinct (%) 8.0%
Missing 82100
Missing (%) 44.0%
Memory size 1.4 MiB

Length

Max length 12
Median length 11
Mean length 8.608298461
Min length 3

Characters and Unicode

Total characters 898956
Distinct characters 12
Distinct categories 3
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 4015
Unique (%) 3.8%

Sample

1st row -72.2664
2nd row -72.9316
3rd row -119.0
4th row -73.3145
5th row -72.88
ValueCountFrequency (%)
72.88 2825
 
2.7%
72.9316 1988
 
1.9%
72.920823 1951
 
1.9%
72.9247 1870
 
1.8%
73.1931 1368
 
1.3%
73.036 1211
 
1.2%
72.8575 1086
 
1.0%
73.4257 1069
 
1.0%
60.75 1048
 
1.0%
72.4831 902
 
0.9%
Other values (8302) 89111
85.3%

Most occurring characters

ValueCountFrequency (%)
7 128553
14.3%
. 104429
11.6%
- 102523
11.4%
2 98658
11.0%
3 84490
9.4%
1 72475
8.1%
8 63406
7.1%
6 53825
6.0%
9 52391
5.8%
5 47500
 
5.3%
Other values (2) 90706
10.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 692004
77.0%
Other Punctuation 104429
 
11.6%
Dash Punctuation 102523
 
11.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
7 128553
18.6%
2 98658
14.3%
3 84490
12.2%
1 72475
10.5%
8 63406
9.2%
6 53825
7.8%
9 52391
7.6%
5 47500
 
6.9%
4 46836
 
6.8%
0 43870
 
6.3%
Other Punctuation
ValueCountFrequency (%)
. 104429
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 102523
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 898956
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
7 128553
14.3%
. 104429
11.6%
- 102523
11.4%
2 98658
11.0%
3 84490
9.4%
1 72475
8.1%
8 63406
7.1%
6 53825
6.0%
9 52391
5.8%
5 47500
 
5.3%
Other values (2) 90706
10.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 898956
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
7 128553
14.3%
. 104429
11.6%
- 102523
11.4%
2 98658
11.0%
3 84490
9.4%
1 72475
8.1%
8 63406
7.1%
6 53825
6.0%
9 52391
5.8%
5 47500
 
5.3%
Other values (2) 90706
10.1%
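
Both decimalLatitude and decimalLongitude are profiled as Text, so numeric use requires an explicit conversion. The sketch below shows one way to turn a text pair into floats and reject values outside the valid coordinate ranges. The function name parse_coordinate is hypothetical, and treating unparseable or out-of-range input as missing is an assumption made for this example.

```python
def parse_coordinate(lat_text, lon_text):
    """Convert text latitude/longitude to a (lat, lon) float pair.

    Returns None when either value is missing, not numeric, or out of range.
    """
    try:
        lat = float(lat_text)
        lon = float(lon_text)
    except (TypeError, ValueError):
        return None
    # NaN fails both comparisons, so it is rejected here as well.
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        return None
    return (lat, lon)

# First sample row of each column.
print(parse_coordinate("41.4854", "-72.2664"))
```

Both columns report the same number of missing values (82100), which suggests the coordinates are missing together, although the report does not state this directly.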
Distinct 5428
Distinct (%) 5.2%
Missing 82138
Missing (%) 44.0%
Memory size 1.4 MiB

Length

Max length 9
Median length 6
Mean length 6.201770268
Min length 3

Characters and Unicode

Total characters 647409
Distinct characters 11
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 2086
Unique (%) 2.0%

Sample

1st row 7093.0
2nd row 7710.0
3rd row 1189.0
4th row 7762.0
5th row 7725.0
ValueCountFrequency (%)
7725.0 2825
 
2.7%
1851.0 2328
 
2.2%
7710.0 1992
 
1.9%
6384.0 1951
 
1.9%
7484.0 1870
 
1.8%
9878.0 1817
 
1.7%
5062.0 1804
 
1.7%
11151.0 1368
 
1.3%
6630.0 1312
 
1.3%
7184.0 1083
 
1.0%
Other values (5418) 86041
82.4%

Most occurring characters

ValueCountFrequency (%)
0 152670
23.6%
. 104391
16.1%
1 58864
 
9.1%
7 50281
 
7.8%
5 47230
 
7.3%
8 43792
 
6.8%
6 41665
 
6.4%
4 40758
 
6.3%
3 37978
 
5.9%
2 35411
 
5.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 543018
83.9%
Other Punctuation 104391
 
16.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 152670
28.1%
1 58864
 
10.8%
7 50281
 
9.3%
5 47230
 
8.7%
8 43792
 
8.1%
6 41665
 
7.7%
4 40758
 
7.5%
3 37978
 
7.0%
2 35411
 
6.5%
9 34369
 
6.3%
Other Punctuation
ValueCountFrequency (%)
. 104391
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 647409
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 152670
23.6%
. 104391
16.1%
1 58864
 
9.1%
7 50281
 
7.8%
5 47230
 
7.3%
8 43792
 
6.8%
6 41665
 
6.4%
4 40758
 
6.3%
3 37978
 
5.9%
2 35411
 
5.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 647409
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 152670
23.6%
. 104391
16.1%
1 58864
 
9.1%
7 50281
 
7.8%
5 47230
 
7.3%
8 43792
 
6.8%
6 41665
 
6.4%
4 40758
 
6.3%
3 37978
 
5.9%
2 35411
 
5.5%

georeferencedBy
Text

Missing 

Distinct 6
Distinct (%) 0.1%
Missing 182211
Missing (%) 97.7%
Memory size 1.4 MiB

Length

Max length 18
Median length 16
Mean length 16.97522001
Min length 13

Characters and Unicode

Total characters 73299
Distinct characters 31
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 4
Unique (%) 0.1%

Sample

1st row Angus J. Mossman
2nd row Angus J. Mossman
3rd row Angus J. Mossman
4th row Angus J. Mossman
5th row Angus J. Mossman
ValueCountFrequency (%)
angus 2204
17.0%
j 2204
17.0%
mossman 2204
17.0%
patrick 2110
16.3%
w 2110
16.3%
sweeney 2110
16.3%
lynn 1
 
< 0.1%
a 1
 
< 0.1%
jones 1
 
< 0.1%
jesse 1
 
< 0.1%
Other values (6) 6
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
8634
 
11.8%
s 6616
 
9.0%
n 6522
 
8.9%
e 6336
 
8.6%
a 4318
 
5.9%
. 4316
 
5.9%
J 2206
 
3.0%
A 2205
 
3.0%
o 2205
 
3.0%
g 2204
 
3.0%
Other values (21) 27737
37.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 47397
64.7%
Uppercase Letter 12952
 
17.7%
Space Separator 8634
 
11.8%
Other Punctuation 4316
 
5.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 6616
14.0%
n 6522
13.8%
e 6336
13.4%
a 4318
9.1%
o 2205
 
4.7%
g 2204
 
4.7%
u 2204
 
4.7%
m 2204
 
4.7%
r 2114
 
4.5%
w 2112
 
4.5%
Other values (8) 10562
22.3%
Uppercase Letter
ValueCountFrequency (%)
J 2206
17.0%
A 2205
17.0%
M 2204
17.0%
W 2110
16.3%
S 2110
16.3%
P 2110
16.3%
L 2
 
< 0.1%
E 2
 
< 0.1%
N 1
 
< 0.1%
F 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
8634
100.0%
Other Punctuation
ValueCountFrequency (%)
. 4316
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 60349
82.3%
Common 12950
 
17.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 6616
 
11.0%
n 6522
 
10.8%
e 6336
 
10.5%
a 4318
 
7.2%
J 2206
 
3.7%
A 2205
 
3.7%
o 2205
 
3.7%
g 2204
 
3.7%
u 2204
 
3.7%
M 2204
 
3.7%
Other values (19) 23329
38.7%
Common
ValueCountFrequency (%)
8634
66.7%
. 4316
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 73299
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
8634
 
11.8%
s 6616
 
9.0%
n 6522
 
8.9%
e 6336
 
8.6%
a 4318
 
5.9%
. 4316
 
5.9%
J 2206
 
3.0%
A 2205
 
3.0%
o 2205
 
3.0%
g 2204
 
3.0%
Other values (21) 27737
37.8%

georeferencedDate
Text

Missing 

Distinct 43
Distinct (%) 0.4%
Missing 174887
Missing (%) 93.8%
Memory size 1.4 MiB

Length

Max length 10
Median length 10
Mean length 8.869266449
Min length 4

Characters and Unicode

Total characters 103256
Distinct characters 11
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 11
Unique (%) 0.1%
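
The summary figures for georeferencedDate can be cross-checked against one another. The following is a minimal sketch, assuming the 186529 observations given in the report overview and assuming that Distinct (%), Unique (%) and the mean length are computed over non-missing values only; the numbers agree with that reading, but the report itself does not state it. The variable names are illustrative.

```python
# Figures copied from the georeferencedDate summary; n_obs is from the overview.
n_obs = 186529
missing = 174887
distinct = 43
unique = 11
total_chars = 103256

# Rows that actually carry a value.
non_missing = n_obs - missing

# Missing share is relative to all observations.
missing_pct = 100 * missing / n_obs

# Assumed: distinct/unique shares and mean length are relative to non-missing rows.
distinct_pct = 100 * distinct / non_missing
unique_pct = 100 * unique / non_missing
mean_length = total_chars / non_missing

print(non_missing)
print(round(missing_pct, 1), round(distinct_pct, 1), round(unique_pct, 1))
print(mean_length)
```

Under these assumptions the results round to the reported 93.8%, 0.4% and 0.1%, and the mean length agrees with the reported value of about 8.8693.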

Sample

1st row 2016-11-04
2nd row 2023-06-13
3rd row 2015
4th row 2016-11-04
5th row 2023-08-24
ValueCountFrequency (%)
2015 2193
18.8%
2016-11-04 1996
17.1%
2023-08-24 1867
16.0%
2016-06-23 1595
13.7%
2023-06-13 1462
12.6%
2024-05-18 1141
9.8%
2024-01-17 616
 
5.3%
2023-08-13 395
 
3.4%
2016-10-31 121
 
1.0%
2016-10-28 51
 
0.4%
Other values (33) 205
 
1.8%

Most occurring characters

ValueCountFrequency (%)
0 21119
20.5%
2 20850
20.2%
- 18896
18.3%
1 14751
14.3%
3 7402
 
7.2%
6 6972
 
6.8%
4 5713
 
5.5%
8 3515
 
3.4%
5 3345
 
3.2%
7 673
 
0.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 84360
81.7%
Dash Punctuation 18896
 
18.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 21119
25.0%
2 20850
24.7%
1 14751
17.5%
3 7402
 
8.8%
6 6972
 
8.3%
4 5713
 
6.8%
8 3515
 
4.2%
5 3345
 
4.0%
7 673
 
0.8%
9 20
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 18896
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 103256
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 21119
20.5%
2 20850
20.2%
- 18896
18.3%
1 14751
14.3%
3 7402
 
7.2%
6 6972
 
6.8%
4 5713
 
5.5%
8 3515
 
3.4%
5 3345
 
3.2%
7 673
 
0.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 103256
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 21119
20.5%
2 20850
20.2%
- 18896
18.3%
1 14751
14.3%
3 7402
 
7.2%
6 6972
 
6.8%
4 5713
 
5.5%
8 3515
 
3.4%
5 3345
 
3.2%
7 673
 
0.7%

georeferenceProtocol
Text

Missing 

Distinct 3
Distinct (%) < 0.1%
Missing 82331
Missing (%) 44.1%
Memory size 1.4 MiB

Length

Max length 17
Median length 17
Mean length 16.43626557
Min length 11

Characters and Unicode

Total characters 1712626
Distinct characters 18
Distinct categories 2
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row physical resource
2nd row digital resource
3rd row digital resource
4th row physical resource
5th row physical resource
ValueCountFrequency (%)
resource 102675
49.6%
physical 53073
25.7%
digital 49602
24.0%
unspecified 1523
 
0.7%

Most occurring characters

ValueCountFrequency (%)
e 208396
12.2%
r 205350
12.0%
s 157271
9.2%
c 157271
9.2%
i 155323
9.1%
u 104198
 
6.1%
a 102675
 
6.0%
l 102675
 
6.0%
102675
 
6.0%
o 102675
 
6.0%
Other values (8) 314117
18.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1609951
94.0%
Space Separator 102675
 
6.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 208396
12.9%
r 205350
12.8%
s 157271
9.8%
c 157271
9.8%
i 155323
9.6%
u 104198
6.5%
a 102675
 
6.4%
l 102675
 
6.4%
o 102675
 
6.4%
p 54596
 
3.4%
Other values (7) 259521
16.1%
Space Separator
ValueCountFrequency (%)
102675
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1609951
94.0%
Common 102675
 
6.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 208396
12.9%
r 205350
12.8%
s 157271
9.8%
c 157271
9.8%
i 155323
9.6%
u 104198
6.5%
a 102675
 
6.4%
l 102675
 
6.4%
o 102675
 
6.4%
p 54596
 
3.4%
Other values (7) 259521
16.1%
Common
ValueCountFrequency (%)
102675
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1712626
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 208396
12.2%
r 205350
12.0%
s 157271
9.2%
c 157271
9.2%
i 155323
9.1%
u 104198
 
6.1%
a 102675
 
6.0%
l 102675
 
6.0%
102675
 
6.0%
o 102675
 
6.0%
Other values (8) 314117
18.3%

georeferenceSources
Text

Missing 

Distinct 12
Distinct (%) < 0.1%
Missing 83888
Missing (%) 45.0%
Memory size 1.4 MiB

Length

Max length 31
Median length 15
Mean length 14.92256506
Min length 4

Characters and Unicode

Total characters 1531667
Distinct characters 40
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) < 0.1%

Sample

1st row topographic map
2nd row GEOLocate
3rd row GEOLocate
4th row topographic map
5th row topographic map
ValueCountFrequency (%)
topographic 53027
25.2%
map 53027
25.2%
geolocate 31271
14.9%
usa 13341
 
6.3%
state 13210
 
6.3%
digital 13210
 
6.3%
data 13210
 
6.3%
resource 13210
 
6.3%
vertnet 1811
 
0.9%
unspecified 1524
 
0.7%
Other values (12) 3467
 
1.6%

Most occurring characters

ValueCountFrequency (%)
a 190363
12.4%
p 160611
10.5%
o 150929
 
9.9%
t 142161
 
9.3%
107667
 
7.0%
c 99042
 
6.5%
i 83723
 
5.5%
e 77911
 
5.1%
r 68240
 
4.5%
g 66434
 
4.3%
Other values (30) 384586
25.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1196078
78.1%
Uppercase Letter 227398
 
14.8%
Space Separator 107667
 
7.0%
Decimal Number 524
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 190363
15.9%
p 160611
13.4%
o 150929
12.6%
t 142161
11.9%
c 99042
8.3%
i 83723
7.0%
e 77911
6.5%
r 68240
 
5.7%
g 66434
 
5.6%
h 53210
 
4.4%
Other values (8) 103454
8.6%
Uppercase Letter
ValueCountFrequency (%)
G 32808
14.4%
E 31836
14.0%
O 31271
13.8%
L 31271
13.8%
S 27769
12.2%
D 26420
11.6%
R 13341
5.9%
U 13341
5.9%
A 13341
5.9%
V 2062
 
0.9%
Other values (7) 3938
 
1.7%
Decimal Number
ValueCountFrequency (%)
1 131
25.0%
4 131
25.0%
0 131
25.0%
2 131
25.0%
Space Separator
ValueCountFrequency (%)
107667
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1423476
92.9%
Common 108191
 
7.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 190363
13.4%
p 160611
11.3%
o 150929
10.6%
t 142161
10.0%
c 99042
 
7.0%
i 83723
 
5.9%
e 77911
 
5.5%
r 68240
 
4.8%
g 66434
 
4.7%
h 53210
 
3.7%
Other values (25) 330852
23.2%
Common
ValueCountFrequency (%)
107667
99.5%
1 131
 
0.1%
4 131
 
0.1%
0 131
 
0.1%
2 131
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1531667
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 190363
12.4%
p 160611
10.5%
o 150929
 
9.9%
t 142161
 
9.3%
107667
 
7.0%
c 99042
 
6.5%
i 83723
 
5.5%
e 77911
 
5.1%
r 68240
 
4.5%
g 66434
 
4.3%
Other values (30) 384586
25.1%

georeferenceRemarks
Text

Missing 

Distinct 6514
Distinct (%) 6.4%
Missing 85474
Missing (%) 45.8%
Memory size 1.4 MiB

Length

Max length 465
Median length 15
Mean length 63.16058582
Min length 2

Characters and Unicode

Total characters 6382693
Distinct characters 85
Distinct categories 11
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 3825
Unique (%) 3.8%

Sample

1st row from CT DEP Map
2nd row ex Argus
3rd row jlsanesdoc (2015-07-14 13:16:00); Geolocated to Shuswap Lake
4th row from CT DEP Map
5th row from CT DEP Map
ValueCountFrequency (%)
the 81840
 
8.6%
from 60217
 
6.3%
ct 56024
 
5.9%
map 53164
 
5.6%
dep 53055
 
5.6%
of 33115
 
3.5%
centroid 28035
 
3.0%
polygon 28033
 
3.0%
uncertainty 18145
 
1.9%
database 17580
 
1.9%
Other values (8113) 520344
54.8%

Most occurring characters

ValueCountFrequency (%)
848525
 
13.3%
e 492230
 
7.7%
o 402931
 
6.3%
t 360014
 
5.6%
a 343662
 
5.4%
r 306928
 
4.8%
n 285528
 
4.5%
i 227311
 
3.6%
s 197953
 
3.1%
d 177049
 
2.8%
Other values (75) 2740562
42.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4013075
62.9%
Space Separator 848525
 
13.3%
Uppercase Letter 636507
 
10.0%
Decimal Number 495160
 
7.8%
Other Punctuation 236193
 
3.7%
Dash Punctuation 59535
 
0.9%
Open Punctuation 42944
 
0.7%
Close Punctuation 42942
 
0.7%
Connector Punctuation 7578
 
0.1%
Math Symbol 233
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 492230
12.3%
o 402931
 
10.0%
t 360014
 
9.0%
a 343662
 
8.6%
r 306928
 
7.6%
n 285528
 
7.1%
i 227311
 
5.7%
s 197953
 
4.9%
d 177049
 
4.4%
l 138359
 
3.4%
Other values (17) 1081110
26.9%
Uppercase Letter
ValueCountFrequency (%)
C 87444
13.7%
T 81216
12.8%
D 77234
12.1%
M 76938
12.1%
P 67525
10.6%
E 62274
9.8%
A 34574
 
5.4%
G 34244
 
5.4%
N 23701
 
3.7%
S 14621
 
2.3%
Other values (16) 76736
12.1%
Other Punctuation
ValueCountFrequency (%)
. 81586
34.5%
: 52775
22.3%
, 50428
21.4%
/ 28633
 
12.1%
; 21546
 
9.1%
& 772
 
0.3%
' 213
 
0.1%
" 120
 
0.1%
? 88
 
< 0.1%
% 27
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
0 116785
23.6%
2 103300
20.9%
1 87747
17.7%
5 44518
 
9.0%
4 35937
 
7.3%
6 29500
 
6.0%
3 24561
 
5.0%
8 24163
 
4.9%
7 14864
 
3.0%
9 13785
 
2.8%
Math Symbol
ValueCountFrequency (%)
= 177
76.0%
+ 55
 
23.6%
~ 1
 
0.4%
Open Punctuation
ValueCountFrequency (%)
( 42943
> 99.9%
[ 1
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 42941
> 99.9%
] 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
848525
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 59535
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 7578
100.0%
Other Symbol
ValueCountFrequency (%)
¦ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4649582
72.8%
Common 1733111
 
27.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 492230
 
10.6%
o 402931
 
8.7%
t 360014
 
7.7%
a 343662
 
7.4%
r 306928
 
6.6%
n 285528
 
6.1%
i 227311
 
4.9%
s 197953
 
4.3%
d 177049
 
3.8%
l 138359
 
3.0%
Other values (43) 1717617
36.9%
Common
ValueCountFrequency (%)
848525
49.0%
0 116785
 
6.7%
2 103300
 
6.0%
1 87747
 
5.1%
. 81586
 
4.7%
- 59535
 
3.4%
: 52775
 
3.0%
, 50428
 
2.9%
5 44518
 
2.6%
( 42943
 
2.5%
Other values (22) 244969
 
14.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6382686
> 99.9%
None 7
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
848525
 
13.3%
e 492230
 
7.7%
o 402931
 
6.3%
t 360014
 
5.6%
a 343662
 
5.4%
r 306928
 
4.8%
n 285528
 
4.5%
i 227311
 
3.6%
s 197953
 
3.1%
d 177049
 
2.8%
Other values (73) 2740555
42.9%
None
ValueCountFrequency (%)
ÿ 6
85.7%
¦ 1
 
14.3%

typeStatus
Text

Missing 

Distinct 12
Distinct (%) 0.3%
Missing 182608
Missing (%) 97.9%
Memory size 1.4 MiB

Length

Max length 12
Median length 7
Mean length 7.223922469
Min length 4

Characters and Unicode

Total characters 28325
Distinct characters 13
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) < 0.1%

Sample

1st row ISOTYPE
2nd row ISOSYNTYPE
3rd row ISOTYPE
4th row ISOTYPE
5th row ISOTYPE
ValueCountFrequency (%)
isotype 2414
61.6%
syntype 851
 
21.7%
isolectotype 201
 
5.1%
type 197
 
5.0%
isosyntype 103
 
2.6%
holotype 90
 
2.3%
paratype 19
 
0.5%
lectotype 17
 
0.4%
cotype 16
 
0.4%
isoneotype 6
 
0.2%
Other values (2) 7
 
0.2%

Most occurring characters

ValueCountFrequency (%)
Y 4875
17.2%
T 4145
14.6%
E 4145
14.6%
P 3947
13.9%
S 3679
13.0%
O 3157
11.1%
I 2725
9.6%
N 960
 
3.4%
L 308
 
1.1%
C 234
 
0.8%
Other values (3) 150
 
0.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 28325
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
Y 4875
17.2%
T 4145
14.6%
E 4145
14.6%
P 3947
13.9%
S 3679
13.0%
O 3157
11.1%
I 2725
9.6%
N 960
 
3.4%
L 308
 
1.1%
C 234
 
0.8%
Other values (3) 150
 
0.5%

Most occurring scripts

ValueCountFrequency (%)
Latin 28325
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
Y 4875
17.2%
T 4145
14.6%
E 4145
14.6%
P 3947
13.9%
S 3679
13.0%
O 3157
11.1%
I 2725
9.6%
N 960
 
3.4%
L 308
 
1.1%
C 234
 
0.8%
Other values (3) 150
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 28325
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
Y 4875
17.2%
T 4145
14.6%
E 4145
14.6%
P 3947
13.9%
S 3679
13.0%
O 3157
11.1%
I 2725
9.6%
N 960
 
3.4%
L 308
 
1.1%
C 234
 
0.8%
Other values (3) 150
 
0.5%

identifiedBy
Text

Missing 

Distinct 193
Distinct (%) 3.2%
Missing 180415
Missing (%) 96.7%
Memory size 1.4 MiB

Length

Max length 22
Median length 20
Mean length 16.31534184
Min length 5

Characters and Unicode

Total characters 99752
Distinct characters 54
Distinct categories 5
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 54
Unique (%) 0.9%

Sample

1st row Martin C. Van Boskirk
2nd row Alexander W. Evans
3rd row Mason E. Hale
4th row Alexander W. Evans
5th row M. H. Lewis
ValueCountFrequency (%)
w 1055
 
5.9%
alexander 744
 
4.2%
evans 744
 
4.2%
george 644
 
3.6%
f 634
 
3.6%
j 597
 
3.4%
c 484
 
2.7%
k 480
 
2.7%
h 421
 
2.4%
carl 419
 
2.4%
Other values (324) 11577
65.0%

Most occurring characters

ValueCountFrequency (%)
11685
 
11.7%
e 8430
 
8.5%
r 7926
 
7.9%
a 6479
 
6.5%
n 5716
 
5.7%
l 5553
 
5.6%
. 5495
 
5.5%
o 5002
 
5.0%
i 4237
 
4.2%
s 3571
 
3.6%
Other values (44) 35658
35.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 64705
64.9%
Uppercase Letter 17846
 
17.9%
Space Separator 11685
 
11.7%
Other Punctuation 5495
 
5.5%
Dash Punctuation 21
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 8430
13.0%
r 7926
12.2%
a 6479
10.0%
n 5716
8.8%
l 5553
8.6%
o 5002
7.7%
i 4237
 
6.5%
s 3571
 
5.5%
t 3188
 
4.9%
d 2155
 
3.3%
Other values (18) 12448
19.2%
Uppercase Letter
ValueCountFrequency (%)
W 1964
11.0%
A 1863
10.4%
M 1766
9.9%
C 1598
 
9.0%
E 1480
 
8.3%
G 1140
 
6.4%
B 1130
 
6.3%
J 997
 
5.6%
H 913
 
5.1%
F 862
 
4.8%
Other values (13) 4133
23.2%
Space Separator
ValueCountFrequency (%)
11685
100.0%
Other Punctuation
ValueCountFrequency (%)
. 5495
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 21
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 82551
82.8%
Common 17201
 
17.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 8430
 
10.2%
r 7926
 
9.6%
a 6479
 
7.8%
n 5716
 
6.9%
l 5553
 
6.7%
o 5002
 
6.1%
i 4237
 
5.1%
s 3571
 
4.3%
t 3188
 
3.9%
d 2155
 
2.6%
Other values (41) 30294
36.7%
Common
ValueCountFrequency (%)
11685
67.9%
. 5495
31.9%
- 21
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 99635
99.9%
None 117
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
11685
 
11.7%
e 8430
 
8.5%
r 7926
 
8.0%
a 6479
 
6.5%
n 5716
 
5.7%
l 5553
 
5.6%
. 5495
 
5.5%
o 5002
 
5.0%
i 4237
 
4.3%
s 3571
 
3.6%
Other values (42) 35541
35.7%
None
ValueCountFrequency (%)
á 105
89.7%
é 12
 
10.3%

dateIdentified
Text

Missing 

Distinct 85
Distinct (%) 4.4%
Missing 184582
Missing (%) 99.0%
Memory size 1.4 MiB

Length

Max length 19
Median length 19
Mean length 19
Min length 19

Characters and Unicode

Total characters 36993
Distinct characters 13
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 20
Unique (%) 1.0%

Sample

1st row 1997-01-01T00:00:00
2nd row 1946-01-01T00:00:00
3rd row 1946-01-01T00:00:00
4th row 1946-01-01T00:00:00
5th row 1995-01-01T00:00:00
ValueCountFrequency (%)
1995-01-01t00:00:00 414
21.3%
1997-01-01t00:00:00 349
17.9%
1984-01-01t00:00:00 135
 
6.9%
1954-01-01t00:00:00 102
 
5.2%
1956-01-01t00:00:00 80
 
4.1%
1946-01-01t00:00:00 63
 
3.2%
1962-01-01t00:00:00 61
 
3.1%
1953-01-01t00:00:00 60
 
3.1%
1957-01-01t00:00:00 54
 
2.8%
1979-01-01t00:00:00 52
 
2.7%
Other values (75) 577
29.6%

Most occurring characters

ValueCountFrequency (%)
0 15888
42.9%
1 5780
 
15.6%
- 3894
 
10.5%
: 3894
 
10.5%
9 2681
 
7.2%
T 1947
 
5.3%
5 908
 
2.5%
7 588
 
1.6%
8 337
 
0.9%
2 334
 
0.9%
Other values (3) 742
 
2.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 27258
73.7%
Dash Punctuation 3894
 
10.5%
Other Punctuation 3894
 
10.5%
Uppercase Letter 1947
 
5.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 15888
58.3%
1 5780
 
21.2%
9 2681
 
9.8%
5 908
 
3.3%
7 588
 
2.2%
8 337
 
1.2%
2 334
 
1.2%
4 333
 
1.2%
6 285
 
1.0%
3 124
 
0.5%
Dash Punctuation
ValueCountFrequency (%)
- 3894
100.0%
Other Punctuation
ValueCountFrequency (%)
: 3894
100.0%
Uppercase Letter
ValueCountFrequency (%)
T 1947
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 35046
94.7%
Latin 1947
 
5.3%

Most frequent character per script

Common
ValueCountFrequency (%)
0 15888
45.3%
1 5780
 
16.5%
- 3894
 
11.1%
: 3894
 
11.1%
9 2681
 
7.6%
5 908
 
2.6%
7 588
 
1.7%
8 337
 
1.0%
2 334
 
1.0%
4 333
 
1.0%
Other values (2) 409
 
1.2%
Latin
ValueCountFrequency (%)
T 1947
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 36993
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 15888
42.9%
1 5780
 
15.6%
- 3894
 
10.5%
: 3894
 
10.5%
9 2681
 
7.2%
T 1947
 
5.3%
5 908
 
2.5%
7 588
 
1.6%
8 337
 
0.9%
2 334
 
0.9%
Other values (3) 742
 
2.0%

identificationRemarks
Text

Missing 

Distinct 2949
Distinct (%) 79.8%
Missing 182833
Missing (%) 98.0%
Memory size 1.4 MiB

Length

Max length 283
Median length 167
Mean length 48.17316017
Min length 9

Characters and Unicode

Total characters 178048
Distinct characters 88
Distinct categories 9
Distinct scripts 2
Distinct blocks 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 2406
Unique (%) 65.1%

Sample

1st row Proc. Amer. Acad. Arts. 22: 420. 1887.
2nd row Mem. Amer. Acad. Arts. n.s. 520. 1862.
3rd row Pl. Wright. (Grisebach) 1: 173. 1860.
4th row Proc. Amer. Acad. 22: 428. 1887.
5th row Proceedings of the American Academy of Arts and Sciences. 7: 381. 1868.
ValueCountFrequency (%)
of 1913
 
6.4%
the 1010
 
3.4%
arts 820
 
2.8%
acad 670
 
2.3%
amer 663
 
2.2%
american 639
 
2.1%
and 627
 
2.1%
academy 614
 
2.1%
sciences 570
 
1.9%
proc 563
 
1.9%
Other values (2176) 21659
72.8%

Most occurring characters

ValueCountFrequency (%)
26052
 
14.6%
. 14090
 
7.9%
e 10440
 
5.9%
a 8607
 
4.8%
1 7541
 
4.2%
o 7231
 
4.1%
r 7176
 
4.0%
n 6647
 
3.7%
t 6588
 
3.7%
i 6274
 
3.5%
Other values (78) 77402
43.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 87279
49.0%
Decimal Number 30944
 
17.4%
Space Separator 26052
 
14.6%
Other Punctuation 17631
 
9.9%
Uppercase Letter 14635
 
8.2%
Dash Punctuation 555
 
0.3%
Close Punctuation 470
 
0.3%
Open Punctuation 470
 
0.3%
Math Symbol 12
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 10440
12.0%
a 8607
9.9%
o 7231
 
8.3%
r 7176
 
8.2%
n 6647
 
7.6%
t 6588
 
7.5%
i 6274
 
7.2%
c 5785
 
6.6%
s 4551
 
5.2%
l 4141
 
4.7%
Other values (23) 19839
22.7%
Uppercase Letter
ValueCountFrequency (%)
A 3827
26.1%
P 2135
14.6%
S 1494
 
10.2%
C 1489
 
10.2%
B 1022
 
7.0%
G 628
 
4.3%
N 580
 
4.0%
F 472
 
3.2%
M 426
 
2.9%
R 325
 
2.2%
Other values (16) 2237
15.3%
Other Punctuation
ValueCountFrequency (%)
. 14090
79.9%
: 2970
 
16.8%
, 359
 
2.0%
; 90
 
0.5%
' 75
 
0.4%
& 29
 
0.2%
" 13
 
0.1%
# 2
 
< 0.1%
/ 2
 
< 0.1%
? 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 7541
24.4%
8 5251
17.0%
6 3239
10.5%
2 2901
 
9.4%
7 2254
 
7.3%
9 2082
 
6.7%
3 2071
 
6.7%
4 1961
 
6.3%
5 1948
 
6.3%
0 1696
 
5.5%
Dash Punctuation
ValueCountFrequency (%)
- 530
95.5%
25
 
4.5%
Close Punctuation
ValueCountFrequency (%)
) 311
66.2%
] 159
33.8%
Open Punctuation
ValueCountFrequency (%)
( 310
66.0%
[ 160
34.0%
Math Symbol
ValueCountFrequency (%)
= 10
83.3%
+ 2
 
16.7%
Space Separator
ValueCountFrequency (%)
26052
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 101914
57.2%
Common 76134
42.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 10440
 
10.2%
a 8607
 
8.4%
o 7231
 
7.1%
r 7176
 
7.0%
n 6647
 
6.5%
t 6588
 
6.5%
i 6274
 
6.2%
c 5785
 
5.7%
s 4551
 
4.5%
l 4141
 
4.1%
Other values (49) 34474
33.8%
Common
ValueCountFrequency (%)
26052
34.2%
. 14090
18.5%
1 7541
 
9.9%
8 5251
 
6.9%
6 3239
 
4.3%
: 2970
 
3.9%
2 2901
 
3.8%
7 2254
 
3.0%
9 2082
 
2.7%
3 2071
 
2.7%
Other values (19) 7683
 
10.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 177975
> 99.9%
None 48
 
< 0.1%
Punctuation 25
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
26052
 
14.6%
. 14090
 
7.9%
e 10440
 
5.9%
a 8607
 
4.8%
1 7541
 
4.2%
o 7231
 
4.1%
r 7176
 
4.0%
n 6647
 
3.7%
t 6588
 
3.7%
i 6274
 
3.5%
Other values (70) 77329
43.4%
None
ValueCountFrequency (%)
ü 25
52.1%
é 9
 
18.8%
ö 8
 
16.7%
è 2
 
4.2%
ä 2
 
4.2%
ë 1
 
2.1%
ñ 1
 
2.1%
Punctuation
ValueCountFrequency (%)
25
100.0%
Distinct 13242
Distinct (%) 7.1%
Missing 18
Missing (%) < 0.1%
Memory size 1.4 MiB

Length

Max length 8
Median length 7
Mean length 6.097645715
Min length 1

Characters and Unicode

Total characters 1137278
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 5385
Unique (%) 2.9%

Sample

1st row 2700991
2nd row 3170096
3rd row 2728060
4th row 4276910
5th row 6
ValueCountFrequency (%)
6 28377
 
15.2%
2721893 1395
 
0.7%
2651126 1343
 
0.7%
2650111 1163
 
0.6%
3196548 1155
 
0.6%
2650583 1063
 
0.6%
2933951 736
 
0.4%
2651736 535
 
0.3%
2650888 527
 
0.3%
2689220 495
 
0.3%
Other values (13232) 149722
80.3%

Most occurring characters

ValueCountFrequency (%)
2 182570
16.1%
6 143478
12.6%
7 116217
10.2%
8 112612
9.9%
5 110350
9.7%
3 109965
9.7%
1 107204
9.4%
0 94336
8.3%
9 88487
7.8%
4 72059
 
6.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1137278
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 182570
16.1%
6 143478
12.6%
7 116217
10.2%
8 112612
9.9%
5 110350
9.7%
3 109965
9.7%
1 107204
9.4%
0 94336
8.3%
9 88487
7.8%
4 72059
 
6.3%

Most occurring scripts

ValueCountFrequency (%)
Common 1137278
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 182570
16.1%
6 143478
12.6%
7 116217
10.2%
8 112612
9.9%
5 110350
9.7%
3 109965
9.7%
1 107204
9.4%
0 94336
8.3%
9 88487
7.8%
4 72059
 
6.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1137278
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 182570
16.1%
6 143478
12.6%
7 116217
10.2%
8 112612
9.9%
5 110350
9.7%
3 109965
9.7%
1 107204
9.4%
0 94336
8.3%
9 88487
7.8%
4 72059
 
6.3%
Distinct 15722
Distinct (%) 8.4%
Missing 0
Missing (%) 0.0%
Memory size 1.4 MiB

Length

Max length 87
Median length 68
Mean length 24.48391939
Min length 5

Characters and Unicode

Total characters 4566961
Distinct characters 96
Distinct categories 10
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 6979
Unique (%) 3.7%

Sample

1st row Luzula bulbosa (Alph.Wood) Smyth & L.C.R.Smyth
2nd row Gentiana clausa Raf.
3rd row Carex muhlenbergii Kunth ex Boott
4th row Lophocolea minor Nees
5th row Plantae
ValueCountFrequency (%)
l 52365
 
9.0%
plantae 28377
 
4.9%
ex 10744
 
1.8%
carex 8803
 
1.5%
7927
 
1.4%
willd 5079
 
0.9%
hedw 4951
 
0.8%
michx 4881
 
0.8%
dumort 4636
 
0.8%
var 3379
 
0.6%
Other values (13172) 451450
77.5%

Most occurring characters

ValueCountFrequency (%)
a 436599
 
9.6%
396063
 
8.7%
i 310153
 
6.8%
e 285903
 
6.3%
l 247925
 
5.4%
r 240173
 
5.3%
n 217491
 
4.8%
. 202035
 
4.4%
o 199856
 
4.4%
u 192016
 
4.2%
Other values (86) 1838747
40.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3367734
73.7%
Uppercase Letter 460197
 
10.1%
Space Separator 396063
 
8.7%
Other Punctuation 215275
 
4.7%
Close Punctuation 54697
 
1.2%
Open Punctuation 54697
 
1.2%
Decimal Number 16824
 
0.4%
Dash Punctuation 894
 
< 0.1%
Math Symbol 576
 
< 0.1%
Connector Punctuation 4
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 436599
13.0%
i 310153
 
9.2%
e 285903
 
8.5%
l 247925
 
7.4%
r 240173
 
7.1%
n 217491
 
6.5%
o 199856
 
5.9%
u 192016
 
5.7%
t 187781
 
5.6%
s 187105
 
5.6%
Other values (35) 862732
25.6%
Uppercase Letter
ValueCountFrequency (%)
L 77852
16.9%
P 56667
12.3%
S 42264
 
9.2%
C 37080
 
8.1%
A 33196
 
7.2%
M 24632
 
5.4%
B 22344
 
4.9%
H 22082
 
4.8%
D 19711
 
4.3%
R 17098
 
3.7%
Other values (21) 107271
23.3%
Decimal Number
ValueCountFrequency (%)
1 4792
28.5%
8 4211
25.0%
2 1967
11.7%
0 1687
 
10.0%
3 998
 
5.9%
4 948
 
5.6%
9 681
 
4.0%
7 548
 
3.3%
5 545
 
3.2%
6 447
 
2.7%
Other Punctuation
ValueCountFrequency (%)
. 202035
93.8%
& 7927
 
3.7%
, 5244
 
2.4%
' 69
 
< 0.1%
Space Separator
ValueCountFrequency (%)
396063
100.0%
Close Punctuation
ValueCountFrequency (%)
) 54697
100.0%
Open Punctuation
ValueCountFrequency (%)
( 54697
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 894
100.0%
Math Symbol
ValueCountFrequency (%)
× 576
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3827931
83.8%
Common 739030
 
16.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 436599
 
11.4%
i 310153
 
8.1%
e 285903
 
7.5%
l 247925
 
6.5%
r 240173
 
6.3%
n 217491
 
5.7%
o 199856
 
5.2%
u 192016
 
5.0%
t 187781
 
4.9%
s 187105
 
4.9%
Other values (66) 1322929
34.6%
Common
ValueCountFrequency (%)
396063
53.6%
. 202035
27.3%
) 54697
 
7.4%
( 54697
 
7.4%
& 7927
 
1.1%
, 5244
 
0.7%
1 4792
 
0.6%
8 4211
 
0.6%
2 1967
 
0.3%
0 1687
 
0.2%
Other values (10) 5710
 
0.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4560557
99.9%
None 6404
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 436599
 
9.6%
396063
 
8.7%
i 310153
 
6.8%
e 285903
 
6.3%
l 247925
 
5.4%
r 240173
 
5.3%
n 217491
 
4.8%
. 202035
 
4.4%
o 199856
 
4.4%
u 192016
 
4.2%
Other values (61) 1832343
40.2%
None
ValueCountFrequency (%)
ü 2543
39.7%
ö 899
 
14.0%
ä 609
 
9.5%
é 590
 
9.2%
× 576
 
9.0%
ø 425
 
6.6%
Á 262
 
4.1%
Å 245
 
3.8%
è 134
 
2.1%
á 44
 
0.7%
Other values (15) 77
 
1.2%
Distinct 792
Distinct (%) 0.4%
Missing 18
Missing (%) < 0.1%
Memory size 1.4 MiB

Length

Max length 119
Median length 91
Mean length 47.98150243
Min length 5

Characters and Unicode

Total characters 8949078
Distinct characters 53
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 93
Unique (%) < 0.1%

Sample

1st row Plantae; Tracheophyta; Poales; Juncaceae
2nd row Plantae; Tracheophyta; Asteridae; Gentianales; Gentianaceae
3rd row Plantae; Tracheophyta; Poales; Cyperaceae
4th row Plantae; Bryophyta; Hepaticopsida; Jungermanniales; Lophocoleaceae
5th row Plantae
ValueCountFrequency (%)
plantae 177514
22.7%
tracheophyta 104057
 
13.3%
bryophyta 37100
 
4.7%
poales 23133
 
3.0%
hepaticopsida 21780
 
2.8%
asteridae 20956
 
2.7%
rosidae 18590
 
2.4%
jungermanniales 16103
 
2.1%
polypodiales 14202
 
1.8%
cyperaceae 13776
 
1.8%
Other values (1036) 334890
42.8%

Most occurring characters

ValueCountFrequency (%)
a 1419556
15.9%
e 1079293
 
12.1%
; 595590
 
6.7%
595590
 
6.7%
t 468738
 
5.2%
l 453031
 
5.1%
o 447517
 
5.0%
c 388641
 
4.3%
r 357029
 
4.0%
h 342137
 
3.8%
Other values (43) 2801956
31.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6975797
77.9%
Uppercase Letter 782101
 
8.7%
Other Punctuation 595590
 
6.7%
Space Separator 595590
 
6.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1419556
20.3%
e 1079293
15.5%
t 468738
 
6.7%
l 453031
 
6.5%
o 447517
 
6.4%
c 388641
 
5.6%
r 357029
 
5.1%
h 342137
 
4.9%
i 338965
 
4.9%
n 337540
 
4.8%
Other values (16) 1343350
19.3%
Uppercase Letter
ValueCountFrequency (%)
P 253748
32.4%
T 107375
13.7%
A 69797
 
8.9%
B 65121
 
8.3%
C 46187
 
5.9%
R 42144
 
5.4%
F 32277
 
4.1%
H 30368
 
3.9%
L 25762
 
3.3%
J 22652
 
2.9%
Other values (15) 86670
 
11.1%
Other Punctuation
ValueCountFrequency (%)
; 595590
100.0%
Space Separator
ValueCountFrequency (%)
595590
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7757898
86.7%
Common 1191180
 
13.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1419556
18.3%
e 1079293
13.9%
t 468738
 
6.0%
l 453031
 
5.8%
o 447517
 
5.8%
c 388641
 
5.0%
r 357029
 
4.6%
h 342137
 
4.4%
i 338965
 
4.4%
n 337540
 
4.4%
Other values (41) 2125451
27.4%
Common
ValueCountFrequency (%)
; 595590
50.0%
(space) 595590
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8949078
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1419556
15.9%
e 1079293
 
12.1%
; 595590
 
6.7%
(space) 595590
 
6.7%
t 468738
 
5.2%
l 453031
 
5.1%
o 447517
 
5.0%
c 388641
 
4.3%
r 357029
 
4.0%
h 342137
 
3.8%
Other values (43) 2801956
31.3%
Distinct7
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.4 MiB

Length

Max length14
Median length7
Mean length6.981981354
Min length5
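
The length figures describe the number of characters per value. The reported mean is consistent with the total character count for this variable divided by the number of observations: 1302342 / 186529 ≈ 6.981981354. A minimal sketch with made-up sample values; the variable names are illustrative only and this is not the profiler's implementation.

```python
from statistics import mean, median

# Hypothetical sample values; the real column has 186529 rows.
values = ["Plantae", "Plantae", "Fungi", "Chromista", "incertae sedis"]
lengths = [len(v) for v in values]

summary = {
    "max": max(lengths),
    "median": median(lengths),
    "mean": mean(lengths),
    "min": min(lengths),
    "total_characters": sum(lengths),
}
print(summary)

# The same identity applied to the figures in this report.
report_mean = 1302342 / 186529
print(round(report_mean, 6))
```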

Characters and Unicode

Total characters1302342
Distinct characters22
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
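
As an illustration of the category counts reported in these sections, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. The helper `category_counts`, the name mapping, and the sample values are assumptions made for this example. The standard library does not expose script or block properties, so only categories are shown.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Pc": "Connector Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
}


def category_counts(values):
    """Count characters per Unicode general category across all values."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the two-letter code for categories not listed above.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# Hypothetical sample values.
sample = ["Plantae", "Fungi", "incertae sedis"]
counts = category_counts(sample)
for name, count in counts.most_common():
    print(name, count)
```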

Unique

Unique0
Unique (%)0.0%

Sample

1st rowPlantae
2nd rowPlantae
3rd rowPlantae
4th rowPlantae
5th rowPlantae
ValueCountFrequency (%)
plantae 177496
95.1%
fungi 5161
 
2.8%
chromista 2981
 
1.6%
bacteria 869
 
0.5%
incertae 18
 
< 0.1%
sedis 18
 
< 0.1%
animalia 2
 
< 0.1%
protozoa 2
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
a 359735
27.6%
n 182677
14.0%
t 181366
13.9%
e 178419
13.7%
P 177498
13.6%
l 177498
13.6%
i 9051
 
0.7%
F 5161
 
0.4%
u 5161
 
0.4%
g 5161
 
0.4%
Other values (12) 20615
 
1.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1115813
85.7%
Uppercase Letter 186511
 
14.3%
Space Separator 18
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 359735
32.2%
n 182677
16.4%
t 181366
16.3%
e 178419
16.0%
l 177498
15.9%
i 9051
 
0.8%
u 5161
 
0.5%
g 5161
 
0.5%
r 3870
 
0.3%
s 3017
 
0.3%
Other values (6) 9858
 
0.9%
Uppercase Letter
ValueCountFrequency (%)
P 177498
95.2%
F 5161
 
2.8%
C 2981
 
1.6%
B 869
 
0.5%
A 2
 
< 0.1%
Space Separator
ValueCountFrequency (%)
(space) 18
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1302324
> 99.9%
Common 18
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 359735
27.6%
n 182677
14.0%
t 181366
13.9%
e 178419
13.7%
P 177498
13.6%
l 177498
13.6%
i 9051
 
0.7%
F 5161
 
0.4%
u 5161
 
0.4%
g 5161
 
0.4%
Other values (11) 20597
 
1.6%
Common
ValueCountFrequency (%)
(space) 18
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1302342
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 359735
27.6%
n 182677
14.0%
t 181366
13.9%
e 178419
13.7%
P 177498
13.6%
l 177498
13.6%
i 9051
 
0.7%
F 5161
 
0.4%
u 5161
 
0.4%
g 5161
 
0.4%
Other values (12) 20615
 
1.6%

phylum
Text

Missing 

Distinct17
Distinct (%)< 0.1%
Missing28431
Missing (%)15.2%
Memory size1.4 MiB

Length

Max length16
Median length12
Mean length11.95495832
Min length8

Characters and Unicode

Total characters1890055
Distinct characters29
Distinct categories2
Distinct scripts1
Distinct blocks1

Unique

Unique4
Unique (%)< 0.1%

Sample

1st rowTracheophyta
2nd rowTracheophyta
3rd rowTracheophyta
4th rowMarchantiophyta
5th rowTracheophyta
ValueCountFrequency (%)
tracheophyta 104064
65.8%
marchantiophyta 21776
 
13.8%
bryophyta 14896
 
9.4%
rhodophyta 5566
 
3.5%
ascomycota 5121
 
3.2%
ochrophyta 2980
 
1.9%
chlorophyta 1763
 
1.1%
cyanobacteria 867
 
0.5%
charophyta 620
 
0.4%
anthocerotophyta 428
 
0.3%
Other values (7) 17
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
a 308078
16.3%
h 289292
15.3%
t 180729
9.6%
y 172989
9.2%
o 171415
9.1%
p 152096
8.0%
r 147397
7.8%
c 140372
7.4%
e 105363
 
5.6%
T 104064
 
5.5%
Other values (19) 118260
 
6.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1731957
91.6%
Uppercase Letter 158098
 
8.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 308078
17.8%
h 289292
16.7%
t 180729
10.4%
y 172989
10.0%
o 171415
9.9%
p 152096
8.8%
r 147397
8.5%
c 140372
8.1%
e 105363
 
6.1%
n 23074
 
1.3%
Other values (9) 41152
 
2.4%
Uppercase Letter
ValueCountFrequency (%)
T 104064
65.8%
M 21778
 
13.8%
B 14905
 
9.4%
R 5566
 
3.5%
A 5550
 
3.5%
C 3251
 
2.1%
O 2980
 
1.9%
E 2
 
< 0.1%
G 1
 
< 0.1%
F 1
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 1890055
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 308078
16.3%
h 289292
15.3%
t 180729
9.6%
y 172989
9.2%
o 171415
9.1%
p 152096
8.0%
r 147397
7.8%
c 140372
7.4%
e 105363
 
5.6%
T 104064
 
5.5%
Other values (19) 118260
 
6.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1890055
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 308078
16.3%
h 289292
15.3%
t 180729
9.6%
y 172989
9.2%
o 171415
9.1%
p 152096
8.0%
r 147397
7.8%
c 140372
7.4%
e 105363
 
5.6%
T 104064
 
5.5%
Other values (19) 118260
 
6.3%

class
Text

Missing 

Distinct49
Distinct (%)< 0.1%
Missing28457
Missing (%)15.3%
Memory size1.4 MiB

Length

Max length19
Median length18
Mean length12.76794752
Min length7

Characters and Unicode

Total characters2018255
Distinct characters39
Distinct categories2
Distinct scripts1
Distinct blocks1

Unique

Unique6
Unique (%)< 0.1%

Sample

1st rowLiliopsida
2nd rowMagnoliopsida
3rd rowLiliopsida
4th rowJungermanniopsida
5th rowMagnoliopsida
ValueCountFrequency (%)
magnoliopsida 47673
30.2%
liliopsida 34102
21.6%
jungermanniopsida 19329
12.2%
polypodiopsida 18403
 
11.6%
bryopsida 11812
 
7.5%
florideophyceae 5407
 
3.4%
lecanoromycetes 4619
 
2.9%
phaeophyceae 2903
 
1.8%
marchantiopsida 2447
 
1.5%
sphagnopsida 2360
 
1.5%
Other values (39) 9017
 
5.7%

Most occurring characters

ValueCountFrequency (%)
i 309597
15.3%
o 259227
12.8%
a 237559
11.8%
p 174973
8.7%
d 167339
8.3%
s 146300
7.2%
n 118880
 
5.9%
l 108074
 
5.4%
g 69804
 
3.5%
e 66212
 
3.3%
Other values (29) 360290
17.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1860183
92.2%
Uppercase Letter 158072
 
7.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 309597
16.6%
o 259227
13.9%
a 237559
12.8%
p 174973
9.4%
d 167339
9.0%
s 146300
7.9%
n 118880
 
6.4%
l 108074
 
5.8%
g 69804
 
3.8%
e 66212
 
3.6%
Other values (12) 202218
10.9%
Uppercase Letter
ValueCountFrequency (%)
M 50120
31.7%
L 40823
25.8%
P 23671
15.0%
J 19329
 
12.2%
B 11983
 
7.6%
F 5407
 
3.4%
S 2396
 
1.5%
C 1691
 
1.1%
U 1428
 
0.9%
A 622
 
0.4%
Other values (7) 602
 
0.4%

Most occurring scripts

ValueCountFrequency (%)
Latin 2018255
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 309597
15.3%
o 259227
12.8%
a 237559
11.8%
p 174973
8.7%
d 167339
8.3%
s 146300
7.2%
n 118880
 
5.9%
l 108074
 
5.4%
g 69804
 
3.5%
e 66212
 
3.3%
Other values (29) 360290
17.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2018255
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 309597
15.3%
o 259227
12.8%
a 237559
11.8%
p 174973
8.7%
d 167339
8.3%
s 146300
7.2%
n 118880
 
5.9%
l 108074
 
5.4%
g 69804
 
3.5%
e 66212
 
3.3%
Other values (29) 360290
17.9%

order
Text

Missing 

Distinct249
Distinct (%)0.2%
Missing28496
Missing (%)15.3%
Memory size1.4 MiB

Length

Max length21
Median length15
Mean length9.99117273
Min length6

Characters and Unicode

Total characters1578935
Distinct characters49
Distinct categories3
Distinct scripts2
Distinct blocks1

Unique

Unique23
Unique (%)< 0.1%

Sample

1st rowPoales
2nd rowGentianales
3rd rowPoales
4th rowJungermanniales
5th rowLamiales
ValueCountFrequency (%)
poales 23133
 
14.6%
polypodiales 14202
 
9.0%
jungermanniales 11845
 
7.5%
asterales 7806
 
4.9%
asparagales 5708
 
3.6%
hypnales 5685
 
3.6%
fabales 5474
 
3.5%
porellales 5373
 
3.4%
lamiales 4708
 
3.0%
rosales 3883
 
2.5%
Other values (239) 70216
44.4%

Most occurring characters

ValueCountFrequency (%)
a 241522
15.3%
l 214346
13.6%
e 204168
12.9%
s 189382
12.0%
i 88627
 
5.6%
o 84012
 
5.3%
n 68943
 
4.4%
r 64166
 
4.1%
P 48956
 
3.1%
p 44954
 
2.8%
Other values (39) 329859
20.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1420900
90.0%
Uppercase Letter 158034
 
10.0%
Connector Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 241522
17.0%
l 214346
15.1%
e 204168
14.4%
s 189382
13.3%
i 88627
 
6.2%
o 84012
 
5.9%
n 68943
 
4.9%
r 64166
 
4.5%
p 44954
 
3.2%
y 35806
 
2.5%
Other values (15) 184974
13.0%
Uppercase Letter
ValueCountFrequency (%)
P 48956
31.0%
A 18515
 
11.7%
J 11845
 
7.5%
L 10817
 
6.8%
C 10375
 
6.6%
F 9147
 
5.8%
M 8185
 
5.2%
H 6767
 
4.3%
S 6495
 
4.1%
R 6216
 
3.9%
Other values (13) 20716
13.1%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1578934
> 99.9%
Common 1
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 241522
15.3%
l 214346
13.6%
e 204168
12.9%
s 189382
12.0%
i 88627
 
5.6%
o 84012
 
5.3%
n 68943
 
4.4%
r 64166
 
4.1%
P 48956
 
3.1%
p 44954
 
2.8%
Other values (38) 329858
20.9%
Common
ValueCountFrequency (%)
_ 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1578935
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 241522
15.3%
l 214346
13.6%
e 204168
12.9%
s 189382
12.0%
i 88627
 
5.6%
o 84012
 
5.3%
n 68943
 
4.4%
r 64166
 
4.1%
P 48956
 
3.1%
p 44954
 
2.8%
Other values (39) 329859
20.9%

family
Text

Missing 

Distinct815
Distinct (%)0.5%
Missing28710
Missing (%)15.4%
Memory size1.4 MiB

Length

Max length22
Median length18
Mean length11.48949113
Min length7

Characters and Unicode

Total characters1813260
Distinct characters52
Distinct categories3
Distinct scripts2
Distinct blocks1

Unique

Unique100
Unique (%)0.1%

Sample

1st rowJuncaceae
2nd rowGentianaceae
3rd rowCyperaceae
4th rowLophocoleaceae
5th rowPhrymaceae
ValueCountFrequency (%)
cyperaceae 13776
 
8.7%
asteraceae 7289
 
4.6%
poaceae 6277
 
4.0%
fabaceae 4763
 
3.0%
orchidaceae 3466
 
2.2%
dryopteridaceae 3452
 
2.2%
rosaceae 3290
 
2.1%
pteridaceae 3008
 
1.9%
sphagnaceae 2360
 
1.5%
juncaceae 2219
 
1.4%
Other values (805) 107919
68.4%

Most occurring characters

ValueCountFrequency (%)
e 392339
21.6%
a 387850
21.4%
c 194926
10.8%
i 92782
 
5.1%
r 83018
 
4.6%
o 72274
 
4.0%
l 62315
 
3.4%
n 50368
 
2.8%
p 48418
 
2.7%
t 46382
 
2.6%
Other values (42) 382588
21.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1655439
91.3%
Uppercase Letter 157820
 
8.7%
Connector Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 392339
23.7%
a 387850
23.4%
c 194926
11.8%
i 92782
 
5.6%
r 83018
 
5.0%
o 72274
 
4.4%
l 62315
 
3.8%
n 50368
 
3.0%
p 48418
 
2.9%
t 46382
 
2.8%
Other values (16) 224767
13.6%
Uppercase Letter
ValueCountFrequency (%)
C 25797
16.3%
P 24928
15.8%
A 20318
12.9%
L 11364
7.2%
S 11029
7.0%
R 9304
 
5.9%
F 8346
 
5.3%
O 7185
 
4.6%
D 6508
 
4.1%
B 6021
 
3.8%
Other values (15) 27020
17.1%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1813259
> 99.9%
Common 1
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 392339
21.6%
a 387850
21.4%
c 194926
10.8%
i 92782
 
5.1%
r 83018
 
4.6%
o 72274
 
4.0%
l 62315
 
3.4%
n 50368
 
2.8%
p 48418
 
2.7%
t 46382
 
2.6%
Other values (41) 382587
21.1%
Common
ValueCountFrequency (%)
_ 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1813260
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 392339
21.6%
a 387850
21.4%
c 194926
10.8%
i 92782
 
5.1%
r 83018
 
4.6%
o 72274
 
4.0%
l 62315
 
3.4%
n 50368
 
2.8%
p 48418
 
2.7%
t 46382
 
2.6%
Other values (42) 382588
21.1%

genus
Text

Missing 

Distinct4085
Distinct (%)2.6%
Missing28788
Missing (%)15.4%
Memory size1.4 MiB

Length

Max length19
Median length16
Mean length8.853595451
Min length3

Characters and Unicode

Total characters1396575
Distinct characters53
Distinct categories3
Distinct scripts2
Distinct blocks1

Unique

Unique1041
Unique (%)0.7%

Sample

1st rowLuzula
2nd rowGentiana
3rd rowCarex
4th rowLophocolea
5th rowMimulus
ValueCountFrequency (%)
carex 8831
 
5.6%
sphagnum 2360
 
1.5%
dryopteris 2269
 
1.4%
juncus 1784
 
1.1%
frullania 1708
 
1.1%
asplenium 1705
 
1.1%
scapania 1518
 
1.0%
sargassum 1505
 
1.0%
polypodium 1375
 
0.9%
viola 1213
 
0.8%
Other values (4075) 133473
84.6%

Most occurring characters

ValueCountFrequency (%)
a 159567
 
11.4%
i 125119
 
9.0%
e 94983
 
6.8%
o 94015
 
6.7%
r 89664
 
6.4%
l 84777
 
6.1%
u 81100
 
5.8%
s 71024
 
5.1%
n 64715
 
4.6%
m 60888
 
4.4%
Other values (43) 470723
33.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1238724
88.7%
Uppercase Letter 157741
 
11.3%
Dash Punctuation 110
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 159567
12.9%
i 125119
 
10.1%
e 94983
 
7.7%
o 94015
 
7.6%
r 89664
 
7.2%
l 84777
 
6.8%
u 81100
 
6.5%
s 71024
 
5.7%
n 64715
 
5.2%
m 60888
 
4.9%
Other values (16) 312872
25.3%
Uppercase Letter
ValueCountFrequency (%)
C 25041
15.9%
P 20312
12.9%
S 19066
12.1%
A 14007
 
8.9%
L 9057
 
5.7%
D 8815
 
5.6%
R 6625
 
4.2%
M 6331
 
4.0%
E 6247
 
4.0%
B 5844
 
3.7%
Other values (16) 36396
23.1%
Dash Punctuation
ValueCountFrequency (%)
- 110
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1396465
> 99.9%
Common 110
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 159567
 
11.4%
i 125119
 
9.0%
e 94983
 
6.8%
o 94015
 
6.7%
r 89664
 
6.4%
l 84777
 
6.1%
u 81100
 
5.8%
s 71024
 
5.1%
n 64715
 
4.6%
m 60888
 
4.4%
Other values (42) 470613
33.7%
Common
ValueCountFrequency (%)
- 110
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1396575
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 159567
 
11.4%
i 125119
 
9.0%
e 94983
 
6.8%
o 94015
 
6.7%
r 89664
 
6.4%
l 84777
 
6.1%
u 81100
 
5.8%
s 71024
 
5.1%
n 64715
 
4.6%
m 60888
 
4.4%
Other values (43) 470723
33.7%

genericName
Text

Missing 

Distinct3709
Distinct (%)2.4%
Missing28825
Missing (%)15.5%
Memory size1.4 MiB

Length

Max length18
Median length15
Mean length8.589090955
Min length3

Characters and Unicode

Total characters1354534
Distinct characters53
Distinct categories2
Distinct scripts1
Distinct blocks2

Unique

Unique987
Unique (%)0.6%

Sample

1st rowLuzula
2nd rowGentiana
3rd rowCarex
4th rowLophocolea
5th rowMimulus
ValueCountFrequency (%)
carex 8803
 
5.6%
sphagnum 2360
 
1.5%
dryopteris 2266
 
1.4%
juncus 1814
 
1.2%
frullania 1708
 
1.1%
asplenium 1557
 
1.0%
scapania 1517
 
1.0%
sargassum 1504
 
1.0%
polypodium 1453
 
0.9%
scirpus 1280
 
0.8%
Other values (3699) 133442
84.6%

Most occurring characters

ValueCountFrequency (%)
a 157291
 
11.6%
i 121653
 
9.0%
e 90240
 
6.7%
o 90134
 
6.7%
r 87682
 
6.5%
u 80913
 
6.0%
l 79580
 
5.9%
s 65454
 
4.8%
n 63491
 
4.7%
m 62423
 
4.6%
Other values (43) 455673
33.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1196830
88.4%
Uppercase Letter 157704
 
11.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 157291
13.1%
i 121653
10.2%
e 90240
 
7.5%
o 90134
 
7.5%
r 87682
 
7.3%
u 80913
 
6.8%
l 79580
 
6.6%
s 65454
 
5.5%
n 63491
 
5.3%
m 62423
 
5.2%
Other values (17) 297969
24.9%
Uppercase Letter
ValueCountFrequency (%)
C 26048
16.5%
P 20798
13.2%
S 16971
10.8%
A 13940
 
8.8%
L 10843
 
6.9%
D 7729
 
4.9%
R 6963
 
4.4%
E 6742
 
4.3%
B 6350
 
4.0%
M 5932
 
3.8%
Other values (16) 35388
22.4%

Most occurring scripts

ValueCountFrequency (%)
Latin 1354534
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 157291
 
11.6%
i 121653
 
9.0%
e 90240
 
6.7%
o 90134
 
6.7%
r 87682
 
6.5%
u 80913
 
6.0%
l 79580
 
5.9%
s 65454
 
4.8%
n 63491
 
4.7%
m 62423
 
4.6%
Other values (43) 455673
33.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1354533
> 99.9%
None 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 157291
 
11.6%
i 121653
 
9.0%
e 90240
 
6.7%
o 90134
 
6.7%
r 87682
 
6.5%
u 80913
 
6.0%
l 79580
 
5.9%
s 65454
 
4.8%
n 63491
 
4.7%
m 62423
 
4.6%
Other values (42) 455672
33.6%
None
ValueCountFrequency (%)
ë 1
100.0%

specificEpithet
Text

Missing 

Distinct6756
Distinct (%)5.1%
Missing54371
Missing (%)29.1%
Memory size1.4 MiB

Length

Max length19
Median length17
Mean length9.061328107
Min length3

Characters and Unicode

Total characters1197527
Distinct characters27
Distinct categories2
Distinct scripts2
Distinct blocks1

Unique

Unique2228
Unique (%)1.7%

Sample

1st rowbulbosa
2nd rowclausa
3rd rowmuhlenbergii
4th rowminor
5th rowringens
ValueCountFrequency (%)
canadensis 1478
 
1.1%
virginiana 723
 
0.5%
palustris 710
 
0.5%
canadense 699
 
0.5%
americana 680
 
0.5%
virginica 544
 
0.4%
pubescens 506
 
0.4%
virginianum 501
 
0.4%
heterophylla 501
 
0.4%
nemorosa 495
 
0.4%
Other values (6746) 125321
94.8%

Most occurring characters

ValueCountFrequency (%)
a 163211
13.6%
i 133654
11.2%
l 84576
 
7.1%
s 84520
 
7.1%
e 82393
 
6.9%
r 79195
 
6.6%
u 77754
 
6.5%
n 74791
 
6.2%
t 65069
 
5.4%
o 63126
 
5.3%
Other values (17) 289238
24.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1196844
99.9%
Dash Punctuation 683
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 163211
13.6%
i 133654
11.2%
l 84576
 
7.1%
s 84520
 
7.1%
e 82393
 
6.9%
r 79195
 
6.6%
u 77754
 
6.5%
n 74791
 
6.2%
t 65069
 
5.4%
o 63126
 
5.3%
Other values (16) 288555
24.1%
Dash Punctuation
ValueCountFrequency (%)
- 683
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1196844
99.9%
Common 683
 
0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 163211
13.6%
i 133654
11.2%
l 84576
 
7.1%
s 84520
 
7.1%
e 82393
 
6.9%
r 79195
 
6.6%
u 77754
 
6.5%
n 74791
 
6.2%
t 65069
 
5.4%
o 63126
 
5.3%
Other values (16) 288555
24.1%
Common
ValueCountFrequency (%)
- 683
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1197527
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 163211
13.6%
i 133654
11.2%
l 84576
 
7.1%
s 84520
 
7.1%
e 82393
 
6.9%
r 79195
 
6.6%
u 77754
 
6.5%
n 74791
 
6.2%
t 65069
 
5.4%
o 63126
 
5.3%
Other values (17) 289238
24.2%

infraspecificEpithet
Text

Missing 

Distinct1039
Distinct (%)23.8%
Missing182164
Missing (%)97.7%
Memory size1.4 MiB

Length

Max length16
Median length13
Mean length9.057961054
Min length4

Characters and Unicode

Total characters39538
Distinct characters27
Distinct categories2
Distinct scripts2
Distinct blocks1

Unique

Unique513
Unique (%)11.8%
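
The percentages for this variable are consistent with a denominator of non-missing values rather than all rows: 186529 − 182164 = 4365 present values, 1039 / 4365 ≈ 23.8% distinct and 513 / 4365 ≈ 11.8% unique. The sketch below assumes "distinct" means the number of different values and "unique" means values that occur exactly once. The function name `distinct_and_unique` and the sample data are hypothetical.

```python
from collections import Counter


def distinct_and_unique(values):
    """Summarise distinct and unique counts, with shares of non-missing values."""
    present = [v for v in values if v is not None]
    counts = Counter(present)
    n = len(present)
    distinct = len(counts)  # number of different values
    unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once
    return {
        "missing": len(values) - n,
        "distinct": distinct,
        "distinct_pct": 100 * distinct / n if n else 0.0,
        "unique": unique,
        "unique_pct": 100 * unique / n if n else 0.0,
    }


# Hypothetical sample column with two missing cells.
sample = ["rufescens", "rufescens", "americana", None, "gigantea", None]
stats = distinct_and_unique(sample)
print(stats)

# The same arithmetic applied to the figures reported for this variable.
present_rows = 186529 - 182164
print(round(100 * 1039 / present_rows, 1), round(100 * 513 / present_rows, 1))
```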

Sample

1st rowtenuifolia
2nd rowpauciflorus
3rd rowabbreviatus
4th rowbellidiastrum
5th rowangustatus
ValueCountFrequency (%)
rufescens 97
 
2.2%
americana 73
 
1.7%
intermedia 62
 
1.4%
lanceolatum 58
 
1.3%
gigantea 49
 
1.1%
ciliare 45
 
1.0%
elatum 40
 
0.9%
gracilis 39
 
0.9%
variolosa 39
 
0.9%
pubescens 37
 
0.8%
Other values (1029) 3826
87.7%

Most occurring characters

ValueCountFrequency (%)
a 5377
13.6%
i 4420
11.2%
l 2907
 
7.4%
s 2842
 
7.2%
e 2822
 
7.1%
r 2570
 
6.5%
u 2464
 
6.2%
n 2336
 
5.9%
o 2194
 
5.5%
c 2104
 
5.3%
Other values (17) 9502
24.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 39535
> 99.9%
Dash Punctuation 3
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 5377
13.6%
i 4420
11.2%
l 2907
 
7.4%
s 2842
 
7.2%
e 2822
 
7.1%
r 2570
 
6.5%
u 2464
 
6.2%
n 2336
 
5.9%
o 2194
 
5.5%
c 2104
 
5.3%
Other values (16) 9499
24.0%
Dash Punctuation
ValueCountFrequency (%)
- 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 39535
> 99.9%
Common 3
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 5377
13.6%
i 4420
11.2%
l 2907
 
7.4%
s 2842
 
7.2%
e 2822
 
7.1%
r 2570
 
6.5%
u 2464
 
6.2%
n 2336
 
5.9%
o 2194
 
5.5%
c 2104
 
5.3%
Other values (16) 9499
24.0%
Common
ValueCountFrequency (%)
- 3
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 39538
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 5377
13.6%
i 4420
11.2%
l 2907
 
7.4%
s 2842
 
7.2%
e 2822
 
7.1%
r 2570
 
6.5%
u 2464
 
6.2%
n 2336
 
5.9%
o 2194
 
5.5%
c 2104
 
5.3%
Other values (17) 9502
24.0%
Distinct10
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.4 MiB

Length

Max length10
Median length7
Mean length6.728985841
Min length4

Characters and Unicode

Total characters1255151
Distinct characters21
Distinct categories1
Distinct scripts1
Distinct blocks1

Unique

Unique0
Unique (%)0.0%

Sample

1st rowSPECIES
2nd rowSPECIES
3rd rowSPECIES
4th rowSPECIES
5th rowKINGDOM
ValueCountFrequency (%)
species 127830
68.5%
kingdom 28426
 
15.2%
genus 25546
 
13.7%
variety 3379
 
1.8%
subspecies 663
 
0.4%
form 323
 
0.2%
family 230
 
0.1%
order 93
 
< 0.1%
class 25
 
< 0.1%
phylum 14
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
E 286004
22.8%
S 283245
22.6%
I 160528
12.8%
C 128518
10.2%
P 128507
10.2%
G 53972
 
4.3%
N 53972
 
4.3%
M 28993
 
2.3%
O 28842
 
2.3%
D 28519
 
2.3%
Other values (11) 74051
 
5.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1255151
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 286004
22.8%
S 283245
22.6%
I 160528
12.8%
C 128518
10.2%
P 128507
10.2%
G 53972
 
4.3%
N 53972
 
4.3%
M 28993
 
2.3%
O 28842
 
2.3%
D 28519
 
2.3%
Other values (11) 74051
 
5.9%

Most occurring scripts

ValueCountFrequency (%)
Latin 1255151
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 286004
22.8%
S 283245
22.6%
I 160528
12.8%
C 128518
10.2%
P 128507
10.2%
G 53972
 
4.3%
N 53972
 
4.3%
M 28993
 
2.3%
O 28842
 
2.3%
D 28519
 
2.3%
Other values (11) 74051
 
5.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1255151
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 286004
22.8%
S 283245
22.6%
I 160528
12.8%
C 128518
10.2%
P 128507
10.2%
G 53972
 
4.3%
N 53972
 
4.3%
M 28993
 
2.3%
O 28842
 
2.3%
D 28519
 
2.3%
Other values (11) 74051
 
5.9%
Distinct179
Distinct (%)0.1%
Missing18
Missing (%)< 0.1%
Memory size1.4 MiB

Length

Max length98
Median length78
Mean length29.30921501
Min length5

Characters and Unicode

Total characters5466491
Distinct characters38
Distinct categories7
Distinct scripts2
Distinct blocks1

Unique

Unique17
Unique (%)< 0.1%

Sample

1st rowrushes; angiosperms; tracheophytes; plants
2nd rowgentians; angiosperms; tracheophytes; plants
3rd rowsedges; angiosperms; tracheophytes; plants
4th rowliverworts; mosses; plants
5th rowplants; plants
ValueCountFrequency (%)
plants 205898
36.1%
tracheophytes 104057
18.2%
angiosperms 81757
 
14.3%
mosses 38430
 
6.7%
liverworts 21780
 
3.8%
sedges 13776
 
2.4%
algae 7363
 
1.3%
sunflowers 7281
 
1.3%
grasses 6277
 
1.1%
ferns 5822
 
1.0%
Other values (217) 78501
 
13.7%

Most occurring characters

ValueCountFrequency (%)
s 752120
13.8%
e 464919
 
8.5%
t 458192
 
8.4%
a 433633
 
7.9%
p 404044
 
7.4%
(space) 384431
 
7.0%
; 364831
 
6.7%
n 324111
 
5.9%
o 294499
 
5.4%
r 290716
 
5.3%
Other values (28) 1294995
23.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4711435
86.2%
Space Separator 384431
 
7.0%
Other Punctuation 367273
 
6.7%
Dash Punctuation 2905
 
0.1%
Uppercase Letter 269
 
< 0.1%
Open Punctuation 89
 
< 0.1%
Close Punctuation 89
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 752120
16.0%
e 464919
9.9%
t 458192
9.7%
a 433633
9.2%
p 404044
8.6%
n 324111
6.9%
o 294499
 
6.3%
r 290716
 
6.2%
l 264824
 
5.6%
h 223345
 
4.7%
Other values (15) 801032
17.0%
Uppercase Letter
ValueCountFrequency (%)
A 78
29.0%
G 62
23.0%
J 54
20.1%
B 47
17.5%
P 27
 
10.0%
H 1
 
0.4%
Other Punctuation
ValueCountFrequency (%)
; 364831
99.3%
, 1369
 
0.4%
' 1073
 
0.3%
Space Separator
ValueCountFrequency (%)
(space) 384431
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2905
100.0%
Open Punctuation
ValueCountFrequency (%)
( 89
100.0%
Close Punctuation
ValueCountFrequency (%)
) 89
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4711704
86.2%
Common 754787
 
13.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 752120
16.0%
e 464919
9.9%
t 458192
9.7%
a 433633
9.2%
p 404044
8.6%
n 324111
6.9%
o 294499
 
6.3%
r 290716
 
6.2%
l 264824
 
5.6%
h 223345
 
4.7%
Other values (21) 801301
17.0%
Common
ValueCountFrequency (%)
(space) 384431
50.9%
; 364831
48.3%
- 2905
 
0.4%
, 1369
 
0.2%
' 1073
 
0.1%
( 89
 
< 0.1%
) 89
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5466491
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 752120
13.8%
e 464919
 
8.5%
t 458192
 
8.4%
a 433633
 
7.9%
p 404044
 
7.4%
(space) 384431
 
7.0%
; 364831
 
6.7%
n 324111
 
5.9%
o 294499
 
5.4%
r 290716
 
5.3%
Other values (28) 1294995
23.7%

nomenclaturalCode
Text

Constant 

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.4 MiB

Length

Max length4
Median length4
Mean length4
Min length4

Characters and Unicode

Total characters746116
Distinct characters4
Distinct categories1
Distinct scripts1
Distinct blocks1

Unique

Unique0
Unique (%)0.0%

Sample

1st rowICBN
2nd rowICBN
3rd rowICBN
4th rowICBN
5th rowICBN
ValueCountFrequency (%)
icbn 186529
100.0%

Most occurring characters

ValueCountFrequency (%)
I 186529
25.0%
C 186529
25.0%
B 186529
25.0%
N 186529
25.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 746116
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
I 186529
25.0%
C 186529
25.0%
B 186529
25.0%
N 186529
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 746116
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
I 186529
25.0%
C 186529
25.0%
B 186529
25.0%
N 186529
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 746116
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
I 186529
25.0%
C 186529
25.0%
B 186529
25.0%
N 186529
25.0%
Distinct3
Distinct (%)< 0.1%
Missing18
Missing (%)< 0.1%
Memory size1.4 MiB

Length

Max length8
Median length8
Mean length7.759520886
Min length7

Characters and Unicode

Total characters1447236
Distinct characters15
Distinct categories1
Distinct scripts1
Distinct blocks1

Unique

Unique0
Unique (%)0.0%

Sample

1st rowACCEPTED
2nd rowACCEPTED
3rd rowSYNONYM
4th rowACCEPTED
5th rowACCEPTED
ValueCountFrequency (%)
accepted 139841
75.0%
synonym 44852
 
24.0%
doubtful 1818
 
1.0%

Most occurring characters

ValueCountFrequency (%)
C 279682
19.3%
E 279682
19.3%
T 141659
9.8%
D 141659
9.8%
A 139841
9.7%
P 139841
9.7%
Y 89704
 
6.2%
N 89704
 
6.2%
O 46670
 
3.2%
S 44852
 
3.1%
Other values (5) 53942
 
3.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1447236
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
C 279682
19.3%
E 279682
19.3%
T 141659
9.8%
D 141659
9.8%
A 139841
9.7%
P 139841
9.7%
Y 89704
 
6.2%
N 89704
 
6.2%
O 46670
 
3.2%
S 44852
 
3.1%
Other values (5) 53942
 
3.7%

Most occurring scripts

ValueCountFrequency (%)
Latin 1447236
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
C 279682
19.3%
E 279682
19.3%
T 141659
9.8%
D 141659
9.8%
A 139841
9.7%
P 139841
9.7%
Y 89704
 
6.2%
N 89704
 
6.2%
O 46670
 
3.2%
S 44852
 
3.1%
Other values (5) 53942
 
3.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1447236
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
C 279682
19.3%
E 279682
19.3%
T 141659
9.8%
D 141659
9.8%
A 139841
9.7%
P 139841
9.7%
Y 89704
 
6.2%
N 89704
 
6.2%
O 46670
 
3.2%
S 44852
 
3.1%
Other values (5) 53942
 
3.7%

taxonRemarks
Text

Constant 

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.4 MiB

Length

Max length26
Median length26
Mean length26
Min length26

Characters and Unicode

Total characters4849754
Distinct characters12
Distinct categories4
Distinct scripts2
Distinct blocks1

Unique

Unique0
Unique (%)0.0%

Sample

1st rowAnimals and Plants: Plants
2nd rowAnimals and Plants: Plants
3rd rowAnimals and Plants: Plants
4th rowAnimals and Plants: Plants
5th rowAnimals and Plants: Plants
ValueCountFrequency (%)
plants 373058
50.0%
animals 186529
25.0%
and 186529
25.0%
2025-01-08T18:32:38.213282image/svg+xmlMatplotlib v3.9.4, https://matplotlib.org/

Most occurring characters

Value | Count | Frequency (%)
n | 746116 | 15.4%
a | 746116 | 15.4%
l | 559587 | 11.5%
s | 559587 | 11.5%
(space) | 559587 | 11.5%
P | 373058 | 7.7%
t | 373058 | 7.7%
A | 186529 | 3.8%
i | 186529 | 3.8%
m | 186529 | 3.8%
Other values (2) | 373058 | 7.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3544051
73.1%
Space Separator 559587
 
11.5%
Uppercase Letter 559587
 
11.5%
Other Punctuation 186529
 
3.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 746116
21.1%
a 746116
21.1%
l 559587
15.8%
s 559587
15.8%
t 373058
10.5%
i 186529
 
5.3%
m 186529
 
5.3%
d 186529
 
5.3%
Uppercase Letter
ValueCountFrequency (%)
P 373058
66.7%
A 186529
33.3%
Space Separator
Value | Count | Frequency (%)
(space) | 559587 | 100.0%
Other Punctuation
ValueCountFrequency (%)
: 186529
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4103638
84.6%
Common 746116
 
15.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 746116
18.2%
a 746116
18.2%
l 559587
13.6%
s 559587
13.6%
P 373058
9.1%
t 373058
9.1%
A 186529
 
4.5%
i 186529
 
4.5%
m 186529
 
4.5%
d 186529
 
4.5%
Common
Value | Count | Frequency (%)
(space) | 559587 | 75.0%
: | 186529 | 25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4849754
100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
n | 746116 | 15.4%
a | 746116 | 15.4%
l | 559587 | 11.5%
s | 559587 | 11.5%
(space) | 559587 | 11.5%
P | 373058 | 7.7%
t | 373058 | 7.7%
A | 186529 | 3.8%
i | 186529 | 3.8%
m | 186529 | 3.8%
Other values (2) | 373058 | 7.7%

datasetKey
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 36
Median length: 36
Mean length: 36
Min length: 36

Characters and Unicode

Total characters: 6715044
Distinct characters: 15
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 963f12d0-f762-11e1-a439-00145eb45e9a
2nd row: 963f12d0-f762-11e1-a439-00145eb45e9a
3rd row: 963f12d0-f762-11e1-a439-00145eb45e9a
4th row: 963f12d0-f762-11e1-a439-00145eb45e9a
5th row: 963f12d0-f762-11e1-a439-00145eb45e9a

Value | Count | Frequency (%)
963f12d0-f762-11e1-a439-00145eb45e9a | 186529 | 100.0%

Most occurring characters

ValueCountFrequency (%)
1 932645
13.9%
- 746116
11.1%
9 559587
8.3%
0 559587
8.3%
e 559587
8.3%
4 559587
8.3%
6 373058
 
5.6%
3 373058
 
5.6%
f 373058
 
5.6%
2 373058
 
5.6%
Other values (5) 1305703
19.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4290167
63.9%
Lowercase Letter 1678761
 
25.0%
Dash Punctuation 746116
 
11.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 932645
21.7%
9 559587
13.0%
0 559587
13.0%
4 559587
13.0%
6 373058
 
8.7%
3 373058
 
8.7%
2 373058
 
8.7%
5 373058
 
8.7%
7 186529
 
4.3%
Lowercase Letter
ValueCountFrequency (%)
e 559587
33.3%
f 373058
22.2%
a 373058
22.2%
d 186529
 
11.1%
b 186529
 
11.1%
Dash Punctuation
ValueCountFrequency (%)
- 746116
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 5036283
75.0%
Latin 1678761
 
25.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 932645
18.5%
- 746116
14.8%
9 559587
11.1%
0 559587
11.1%
4 559587
11.1%
6 373058
 
7.4%
3 373058
 
7.4%
2 373058
 
7.4%
5 373058
 
7.4%
7 186529
 
3.7%
Latin
ValueCountFrequency (%)
e 559587
33.3%
f 373058
22.2%
a 373058
22.2%
d 186529
 
11.1%
b 186529
 
11.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6715044
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 932645
13.9%
- 746116
11.1%
9 559587
8.3%
0 559587
8.3%
e 559587
8.3%
4 559587
8.3%
6 373058
 
5.6%
3 373058
 
5.6%
f 373058
 
5.6%
2 373058
 
5.6%
Other values (5) 1305703
19.4%

publishingCountry
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 373058
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: US
2nd row: US
3rd row: US
4th row: US
5th row: US

Value | Count | Frequency (%)
us | 186529 | 100.0%

Most occurring characters

ValueCountFrequency (%)
U 186529
50.0%
S 186529
50.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 373058
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
U 186529
50.0%
S 186529
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 373058
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
U 186529
50.0%
S 186529
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 373058
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
U 186529
50.0%
S 186529
50.0%

Distinct: 18062
Distinct (%): 9.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 24
Median length: 24
Mean length: 23.99661179
Min length: 20

Characters and Unicode

Total characters: 4476064
Distinct characters: 15
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1510
Unique (%): 0.8%

Sample

1st row: 2025-01-07T13:09:13.317Z
2nd row: 2025-01-07T13:09:13.317Z
3rd row: 2025-01-07T13:09:13.317Z
4th row: 2025-01-07T13:09:13.317Z
5th row: 2025-01-07T13:09:13.318Z

Value | Count | Frequency (%)
2025-01-07t13:09:09.691z | 61 | < 0.1%
2025-01-07t13:09:14.842z | 55 | < 0.1%
2025-01-07t13:09:13.153z | 54 | < 0.1%
2025-01-07t13:09:14.841z | 53 | < 0.1%
2025-01-07t13:09:14.268z | 52 | < 0.1%
2025-01-07t13:09:10.471z | 51 | < 0.1%
2025-01-07t13:09:13.231z | 51 | < 0.1%
2025-01-07t13:09:13.898z | 51 | < 0.1%
2025-01-07t13:09:12.942z | 51 | < 0.1%
2025-01-07t13:09:14.350z | 50 | < 0.1%
Other values (18052) | 186000 | 99.7%

Most occurring characters

ValueCountFrequency (%)
0 902065
20.2%
1 544038
12.2%
2 450890
10.1%
- 373058
8.3%
: 373058
8.3%
3 270263
 
6.0%
5 267401
 
6.0%
7 257877
 
5.8%
9 254149
 
5.7%
T 186529
 
4.2%
Other values (5) 596736
13.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3170519
70.8%
Other Punctuation 559429
 
12.5%
Dash Punctuation 373058
 
8.3%
Uppercase Letter 373058
 
8.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 902065
28.5%
1 544038
17.2%
2 450890
14.2%
3 270263
 
8.5%
5 267401
 
8.4%
7 257877
 
8.1%
9 254149
 
8.0%
4 80989
 
2.6%
8 73891
 
2.3%
6 68956
 
2.2%
Other Punctuation
ValueCountFrequency (%)
: 373058
66.7%
. 186371
33.3%
Uppercase Letter
ValueCountFrequency (%)
T 186529
50.0%
Z 186529
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 373058
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4103006
91.7%
Latin 373058
 
8.3%

Most frequent character per script

Common
ValueCountFrequency (%)
0 902065
22.0%
1 544038
13.3%
2 450890
11.0%
- 373058
9.1%
: 373058
9.1%
3 270263
 
6.6%
5 267401
 
6.5%
7 257877
 
6.3%
9 254149
 
6.2%
. 186371
 
4.5%
Other values (3) 223836
 
5.5%
Latin
ValueCountFrequency (%)
T 186529
50.0%
Z 186529
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4476064
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 902065
20.2%
1 544038
12.2%
2 450890
10.1%
- 373058
8.3%
: 373058
8.3%
3 270263
 
6.0%
5 267401
 
6.0%
7 257877
 
5.8%
9 254149
 
5.7%
T 186529
 
4.2%
Other values (5) 596736
13.3%

elevation
Text

Missing 

Distinct: 751
Distinct (%): 9.9%
Missing: 178933
Missing (%): 95.9%
Memory size: 1.4 MiB

Length

Max length: 6
Median length: 5
Mean length: 5.40297525
Min length: 3

Characters and Unicode

Total characters: 41041
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 278
Unique (%): 3.7%

Sample

1st row: 564.0
2nd row: 1500.0
3rd row: 1012.0
4th row: 137.0
5th row: 1463.0

Value | Count | Frequency (%)
1524.0 | 271 | 3.6%
305.0 | 236 | 3.1%
1219.0 | 194 | 2.6%
1829.0 | 188 | 2.5%
366.0 | 170 | 2.2%
914.0 | 168 | 2.2%
610.0 | 162 | 2.1%
2743.0 | 156 | 2.1%
762.0 | 151 | 2.0%
244.0 | 150 | 2.0%
Other values (741) | 5750 | 75.7%

Most occurring characters

ValueCountFrequency (%)
0 10709
26.1%
. 7596
18.5%
1 4465
10.9%
2 3932
 
9.6%
5 2676
 
6.5%
3 2484
 
6.1%
4 2352
 
5.7%
6 1935
 
4.7%
7 1714
 
4.2%
8 1661
 
4.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 33445
81.5%
Other Punctuation 7596
 
18.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 10709
32.0%
1 4465
13.4%
2 3932
 
11.8%
5 2676
 
8.0%
3 2484
 
7.4%
4 2352
 
7.0%
6 1935
 
5.8%
7 1714
 
5.1%
8 1661
 
5.0%
9 1517
 
4.5%
Other Punctuation
ValueCountFrequency (%)
. 7596
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 41041
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 10709
26.1%
. 7596
18.5%
1 4465
10.9%
2 3932
 
9.6%
5 2676
 
6.5%
3 2484
 
6.1%
4 2352
 
5.7%
6 1935
 
4.7%
7 1714
 
4.2%
8 1661
 
4.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 41041
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 10709
26.1%
. 7596
18.5%
1 4465
10.9%
2 3932
 
9.6%
5 2676
 
6.5%
3 2484
 
6.1%
4 2352
 
5.7%
6 1935
 
4.7%
7 1714
 
4.2%
8 1661
 
4.0%
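
The elevation column is profiled as Text even though its values look numeric. A minimal sketch of parsing such cells before computing numeric statistics, using only the five sample values listed for the column plus one missing cell; the helper name `to_float` is hypothetical and not part of the report.

```python
from statistics import median
from typing import Optional


def to_float(value: Optional[str]) -> Optional[float]:
    """Parse a text cell as a float; return None for missing or non-numeric cells."""
    if value is None or value.strip() == "":
        return None
    try:
        return float(value)
    except ValueError:
        return None


# The five sample values shown for elevation, plus one missing cell.
raw = ["564.0", "1500.0", "1012.0", "137.0", "1463.0", None]
parsed = [to_float(v) for v in raw]
present = [v for v in parsed if v is not None]

print(median(present))
```

With 95.9% of the cells missing, any statistic computed this way describes only the small populated subset of records.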

elevationAccuracy
Text

Missing 

Distinct: 77
Distinct (%): 10.5%
Missing: 185793
Missing (%): 99.6%
Memory size: 1.4 MiB

Length

Max length: 6
Median length: 5
Mean length: 4.486413043
Min length: 3

Characters and Unicode

Total characters: 3302
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 28
Unique (%): 3.8%

Sample

1st row: 50.0
2nd row: 150.0
3rd row: 50.0
4th row: 50.0
5th row: 172.5

Value | Count | Frequency (%)
50.0 | 100 | 13.6%
327.5 | 62 | 8.4%
100.0 | 59 | 8.0%
152.5 | 57 | 7.7%
0.0 | 48 | 6.5%
150.0 | 40 | 5.4%
390.0 | 26 | 3.5%
76.0 | 24 | 3.3%
62.5 | 23 | 3.1%
381.0 | 19 | 2.6%
Other values (67) | 278 | 37.8%

Most occurring characters

ValueCountFrequency (%)
0 973
29.5%
. 736
22.3%
5 571
17.3%
1 242
 
7.3%
2 239
 
7.2%
3 186
 
5.6%
7 159
 
4.8%
6 80
 
2.4%
4 43
 
1.3%
8 38
 
1.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2566
77.7%
Other Punctuation 736
 
22.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 973
37.9%
5 571
22.3%
1 242
 
9.4%
2 239
 
9.3%
3 186
 
7.2%
7 159
 
6.2%
6 80
 
3.1%
4 43
 
1.7%
8 38
 
1.5%
9 35
 
1.4%
Other Punctuation
ValueCountFrequency (%)
. 736
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 3302
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 973
29.5%
. 736
22.3%
5 571
17.3%
1 242
 
7.3%
2 239
 
7.2%
3 186
 
5.6%
7 159
 
4.8%
6 80
 
2.4%
4 43
 
1.3%
8 38
 
1.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3302
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 973
29.5%
. 736
22.3%
5 571
17.3%
1 242
 
7.3%
2 239
 
7.2%
3 186
 
5.6%
7 159
 
4.8%
6 80
 
2.4%
4 43
 
1.3%
8 38
 
1.2%

Distinct: 67
Distinct (%): 15.3%
Missing: 186092
Missing (%): 99.8%
Memory size: 1.4 MiB

Length

Max length: 18
Median length: 18
Mean length: 15.97482838
Min length: 3

Characters and Unicode

Total characters: 6981
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 31
Unique (%): 7.1%

Sample

1st row: 1170.0899523613987
2nd row: 4974.498988381608
3rd row: 4974.498988381608
4th row: 4131.791168613916
5th row: 2047.6989123381013

Value | Count | Frequency (%)
0.0 | 51 | 11.7%
2589.9343731029417 | 42 | 9.6%
4974.498988381608 | 27 | 6.2%
1360.0074314533344 | 26 | 5.9%
2047.6989123381013 | 25 | 5.7%
1632.4374102813665 | 23 | 5.3%
3512.1947738856975 | 21 | 4.8%
2503.4790916570705 | 17 | 3.9%
2092.6375926612645 | 15 | 3.4%
911.6597020315339 | 15 | 3.4%
Other values (57) | 175 | 40.0%

Most occurring characters

ValueCountFrequency (%)
3 864
12.4%
1 814
11.7%
0 738
10.6%
9 678
9.7%
2 612
8.8%
4 611
8.8%
7 602
8.6%
6 563
8.1%
8 547
7.8%
5 515
7.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6544
93.7%
Other Punctuation 437
 
6.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 864
13.2%
1 814
12.4%
0 738
11.3%
9 678
10.4%
2 612
9.4%
4 611
9.3%
7 602
9.2%
6 563
8.6%
8 547
8.4%
5 515
7.9%
Other Punctuation
ValueCountFrequency (%)
. 437
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 6981
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
3 864
12.4%
1 814
11.7%
0 738
10.6%
9 678
9.7%
2 612
8.8%
4 611
8.8%
7 602
8.6%
6 563
8.1%
8 547
7.8%
5 515
7.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6981
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3 864
12.4%
1 814
11.7%
0 738
10.6%
9 678
9.7%
2 612
8.8%
4 611
8.8%
7 602
8.6%
6 563
8.1%
8 547
7.8%
5 515
7.4%

issue
Text

Distinct: 63
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 193
Median length: 95
Mean length: 97.25331182
Min length: 95

Characters and Unicode

Total characters: 18140563
Distinct characters: 28
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 12
Unique (%): < 0.1%

Sample

1st row: OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;INSTITUTION_MATCH_FUZZY;COLLECTION_MATCH_FUZZY
2nd row: OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;INSTITUTION_MATCH_FUZZY;COLLECTION_MATCH_FUZZY
3rd row: OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;INSTITUTION_MATCH_FUZZY;COLLECTION_MATCH_FUZZY
4th row: OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;INSTITUTION_MATCH_FUZZY;COLLECTION_MATCH_FUZZY
5th row: OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;INSTITUTION_MATCH_FUZZY;COLLECTION_MATCH_FUZZY

Value | Count | Frequency (%)
occurrence_status_inferred_from_individual_count;institution_match_fuzzy;collection_match_fuzzy | 169646 | 90.9%
occurrence_status_inferred_from_individual_count;coordinate_rounded;institution_match_fuzzy;collection_match_fuzzy | 5872 | 3.1%
occurrence_status_inferred_from_individual_count;taxon_match_higherrank;institution_match_fuzzy;collection_match_fuzzy | 2416 | 1.3%
occurrence_status_inferred_from_individual_count;recorded_date_mismatch;institution_match_fuzzy;collection_match_fuzzy | 2089 | 1.1%
occurrence_status_inferred_from_individual_count;taxon_match_fuzzy;institution_match_fuzzy;collection_match_fuzzy | 1835 | 1.0%
occurrence_status_inferred_from_individual_count;continent_coordinate_mismatch;institution_match_fuzzy;collection_match_fuzzy | 1078 | 0.6%
occurrence_status_inferred_from_individual_count;country_derived_from_coordinates;institution_match_fuzzy;collection_match_fuzzy | 939 | 0.5%
occurrence_status_inferred_from_individual_count;continent_derived_from_coordinates;institution_match_fuzzy;collection_match_fuzzy | 771 | 0.4%
occurrence_status_inferred_from_individual_count;coordinate_reprojected;institution_match_fuzzy;collection_match_fuzzy | 279 | 0.1%
occurrence_status_inferred_from_individual_count;coordinate_rounded;taxon_match_higherrank;institution_match_fuzzy;collection_match_fuzzy | 198 | 0.1%
Other values (53) | 1406 | 0.8%

Most occurring characters

ValueCountFrequency (%)
T 1713116
 
9.4%
_ 1711007
 
9.4%
C 1518846
 
8.4%
I 1514962
 
8.4%
N 1340024
 
7.4%
U 1316104
 
7.3%
O 1160937
 
6.4%
E 968295
 
5.3%
R 966824
 
5.3%
A 776170
 
4.3%
Other values (18) 5154278
28.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 16037928
88.4%
Connector Punctuation 1711007
 
9.4%
Other Punctuation 391262
 
2.2%
Decimal Number 366
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
T 1713116
10.7%
C 1518846
 
9.5%
I 1514962
 
9.4%
N 1340024
 
8.4%
U 1316104
 
8.2%
O 1160937
 
7.2%
E 968295
 
6.0%
R 966824
 
6.0%
A 776170
 
4.8%
F 750473
 
4.7%
Other values (14) 4012177
25.0%
Decimal Number
ValueCountFrequency (%)
8 183
50.0%
4 183
50.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1711007
100.0%
Other Punctuation
ValueCountFrequency (%)
; 391262
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 16037928
88.4%
Common 2102635
 
11.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
T 1713116
10.7%
C 1518846
 
9.5%
I 1514962
 
9.4%
N 1340024
 
8.4%
U 1316104
 
8.2%
O 1160937
 
7.2%
E 968295
 
6.0%
R 966824
 
6.0%
A 776170
 
4.8%
F 750473
 
4.7%
Other values (14) 4012177
25.0%
Common
ValueCountFrequency (%)
_ 1711007
81.4%
; 391262
 
18.6%
8 183
 
< 0.1%
4 183
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 18140563
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
T 1713116
 
9.4%
_ 1711007
 
9.4%
C 1518846
 
8.4%
I 1514962
 
8.4%
N 1340024
 
7.4%
U 1316104
 
7.3%
O 1160937
 
6.4%
E 968295
 
5.3%
R 966824
 
5.3%
A 776170
 
4.3%
Other values (18) 5154278
28.4%
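
Each value of the issue column is a semicolon-separated list of flags, so the whole-string frequencies reported for it hide how often each individual flag occurs. A minimal sketch of splitting the strings and counting flags separately; the function name `flag_counts` is illustrative, and the two input rows are the uppercase forms of the two most frequent values (an assumption for the second row, which the report shows only in lowercase).

```python
from collections import Counter


def flag_counts(issue_values):
    """Count individual flags across semicolon-separated issue strings."""
    counts = Counter()
    for value in issue_values:
        # Skip empty fragments that a trailing or doubled ';' would produce.
        counts.update(flag for flag in value.split(";") if flag)
    return counts


rows = [
    "OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;INSTITUTION_MATCH_FUZZY;COLLECTION_MATCH_FUZZY",
    "OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;COORDINATE_ROUNDED;INSTITUTION_MATCH_FUZZY;COLLECTION_MATCH_FUZZY",
]
counts = flag_counts(rows)
print(counts.most_common())
```

Applied to the full column, a count like this would show which flags accompany nearly every record and which mark a small subset.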

mediaType
Text

Missing 

Distinct: 9
Distinct (%): < 0.1%
Missing: 9347
Missing (%): 5.0%
Memory size: 1.4 MiB

Length

Max length: 98
Median length: 10
Mean length: 11.46975426
Min length: 10

Characters and Unicode

Total characters: 2032234
Distinct characters: 10
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: StillImage
2nd row: StillImage
3rd row: StillImage
4th row: StillImage;StillImage
5th row: StillImage

Value | Count | Frequency (%)
stillimage | 155884 | 88.0%
stillimage;stillimage | 19435 | 11.0%
stillimage;stillimage;stillimage | 1484 | 0.8%
stillimage;stillimage;stillimage;stillimage | 290 | 0.2%
stillimage;stillimage;stillimage;stillimage;stillimage | 63 | < 0.1%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage | 15 | < 0.1%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage | 6 | < 0.1%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage | 3 | < 0.1%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage | 2 | < 0.1%

Most occurring characters

ValueCountFrequency (%)
l 401712
19.8%
S 200856
9.9%
t 200856
9.9%
i 200856
9.9%
I 200856
9.9%
m 200856
9.9%
a 200856
9.9%
g 200856
9.9%
e 200856
9.9%
; 23674
 
1.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1606848
79.1%
Uppercase Letter 401712
 
19.8%
Other Punctuation 23674
 
1.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 401712
25.0%
t 200856
12.5%
i 200856
12.5%
m 200856
12.5%
a 200856
12.5%
g 200856
12.5%
e 200856
12.5%
Uppercase Letter
ValueCountFrequency (%)
S 200856
50.0%
I 200856
50.0%
Other Punctuation
ValueCountFrequency (%)
; 23674
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2008560
98.8%
Common 23674
 
1.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
l 401712
20.0%
S 200856
10.0%
t 200856
10.0%
i 200856
10.0%
I 200856
10.0%
m 200856
10.0%
a 200856
10.0%
g 200856
10.0%
e 200856
10.0%
Common
ValueCountFrequency (%)
; 23674
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2032234
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l 401712
19.8%
S 200856
9.9%
t 200856
9.9%
i 200856
9.9%
I 200856
9.9%
m 200856
9.9%
a 200856
9.9%
g 200856
9.9%
e 200856
9.9%
; 23674
 
1.2%
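
The mediaType values repeat `StillImage` once per attached media item, so the number of items per record can be recovered by splitting on the semicolon. The sketch below is a cross-check under that assumption: it recomputes the total item count from the mediaType value frequencies and compares it with the character counts reported for the column. The helper name `media_item_count` is hypothetical.

```python
def media_item_count(media_type):
    """Number of media items in a semicolon-separated mediaType cell (0 if missing)."""
    if not media_type:
        return 0
    return len(media_type.split(";"))


# Records per item count, transcribed from the mediaType value frequencies.
records_by_items = {1: 155884, 2: 19435, 3: 1484, 4: 290, 5: 63, 6: 15, 7: 6, 8: 2, 9: 3}

total_items = sum(n * count for n, count in records_by_items.items())
total_records = sum(records_by_items.values())

print(media_item_count("StillImage;StillImage"), total_items, total_records)
```

The item total agrees with the 200856 occurrences of the character `S`, and the record total agrees with 186529 observations minus 9347 missing cells.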

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 5
Median length: 4
Mean length: 4.440146036
Min length: 4

Characters and Unicode

Total characters: 828216
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: false
2nd row: true
3rd row: true
4th row: true
5th row: false

Value | Count | Frequency (%)
true | 104429 | 56.0%
false | 82100 | 44.0%

Most occurring characters

ValueCountFrequency (%)
e 186529
22.5%
t 104429
12.6%
r 104429
12.6%
u 104429
12.6%
f 82100
9.9%
a 82100
9.9%
l 82100
9.9%
s 82100
9.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 828216
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 186529
22.5%
t 104429
12.6%
r 104429
12.6%
u 104429
12.6%
f 82100
9.9%
a 82100
9.9%
l 82100
9.9%
s 82100
9.9%

Most occurring scripts

ValueCountFrequency (%)
Latin 828216
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 186529
22.5%
t 104429
12.6%
r 104429
12.6%
u 104429
12.6%
f 82100
9.9%
a 82100
9.9%
l 82100
9.9%
s 82100
9.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 828216
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 186529
22.5%
t 104429
12.6%
r 104429
12.6%
u 104429
12.6%
f 82100
9.9%
a 82100
9.9%
l 82100
9.9%
s 82100
9.9%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 5
Median length: 5
Mean length: 4.999555029
Min length: 4

Characters and Unicode

Total characters: 932562
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: false
2nd row: false
3rd row: false
4th row: false
5th row: false

Value | Count | Frequency (%)
false | 186446 | > 99.9%
true | 83 | < 0.1%

Most occurring characters

ValueCountFrequency (%)
e 186529
20.0%
f 186446
20.0%
a 186446
20.0%
l 186446
20.0%
s 186446
20.0%
t 83
 
< 0.1%
r 83
 
< 0.1%
u 83
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 932562
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 186529
20.0%
f 186446
20.0%
a 186446
20.0%
l 186446
20.0%
s 186446
20.0%
t 83
 
< 0.1%
r 83
 
< 0.1%
u 83
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 932562
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 186529
20.0%
f 186446
20.0%
a 186446
20.0%
l 186446
20.0%
s 186446
20.0%
t 83
 
< 0.1%
r 83
 
< 0.1%
u 83
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 932562
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 186529
20.0%
f 186446
20.0%
a 186446
20.0%
l 186446
20.0%
s 186446
20.0%
t 83
 
< 0.1%
r 83
 
< 0.1%
u 83
 
< 0.1%

Distinct: 15722
Distinct (%): 8.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 8
Median length: 7
Mean length: 6.088613567
Min length: 1

Characters and Unicode

Total characters: 1135703
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6979
Unique (%): 3.7%

Sample

1st row: 2700991
2nd row: 3170096
3rd row: 2728062
4th row: 4276910
5th row: 6

Value | Count | Frequency (%)
6 | 28377 | 15.2%
2721893 | 1378 | 0.7%
2651126 | 1339 | 0.7%
2650111 | 1163 | 0.6%
3196548 | 1155 | 0.6%
2650583 | 961 | 0.5%
2933951 | 736 | 0.4%
2651736 | 535 | 0.3%
2650888 | 527 | 0.3%
4277138 | 495 | 0.3%
Other values (15712) | 149863 | 80.3%

Most occurring characters

ValueCountFrequency (%)
2 179759
15.8%
6 140693
12.4%
7 120853
10.6%
5 113020
10.0%
3 112168
9.9%
8 106777
9.4%
1 105406
9.3%
0 93031
8.2%
9 84742
7.5%
4 79254
7.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1135703
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 179759
15.8%
6 140693
12.4%
7 120853
10.6%
5 113020
10.0%
3 112168
9.9%
8 106777
9.4%
1 105406
9.3%
0 93031
8.2%
9 84742
7.5%
4 79254
7.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1135703
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 179759
15.8%
6 140693
12.4%
7 120853
10.6%
5 113020
10.0%
3 112168
9.9%
8 106777
9.4%
1 105406
9.3%
0 93031
8.2%
9 84742
7.5%
4 79254
7.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1135703
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 179759
15.8%
6 140693
12.4%
7 120853
10.6%
5 113020
10.0%
3 112168
9.9%
8 106777
9.4%
1 105406
9.3%
0 93031
8.2%
9 84742
7.5%
4 79254
7.0%

Distinct: 13242
Distinct (%): 7.1%
Missing: 18
Missing (%): < 0.1%
Memory size: 1.4 MiB

Length

Max length: 8
Median length: 7
Mean length: 6.097645715
Min length: 1

Characters and Unicode

Total characters: 1137278
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5385
Unique (%): 2.9%

Sample

1st row: 2700991
2nd row: 3170096
3rd row: 2728060
4th row: 4276910
5th row: 6

Value | Count | Frequency (%)
6 | 28377 | 15.2%
2721893 | 1395 | 0.7%
2651126 | 1343 | 0.7%
2650111 | 1163 | 0.6%
3196548 | 1155 | 0.6%
2650583 | 1063 | 0.6%
2933951 | 736 | 0.4%
2651736 | 535 | 0.3%
2650888 | 527 | 0.3%
2689220 | 495 | 0.3%
Other values (13232) | 149722 | 80.3%

Most occurring characters

ValueCountFrequency (%)
2 182570
16.1%
6 143478
12.6%
7 116217
10.2%
8 112612
9.9%
5 110350
9.7%
3 109965
9.7%
1 107204
9.4%
0 94336
8.3%
9 88487
7.8%
4 72059
 
6.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1137278
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 182570
16.1%
6 143478
12.6%
7 116217
10.2%
8 112612
9.9%
5 110350
9.7%
3 109965
9.7%
1 107204
9.4%
0 94336
8.3%
9 88487
7.8%
4 72059
 
6.3%

Most occurring scripts

ValueCountFrequency (%)
Common 1137278
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 182570
16.1%
6 143478
12.6%
7 116217
10.2%
8 112612
9.9%
5 110350
9.7%
3 109965
9.7%
1 107204
9.4%
0 94336
8.3%
9 88487
7.8%
4 72059
 
6.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1137278
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 182570
16.1%
6 143478
12.6%
7 116217
10.2%
8 112612
9.9%
5 110350
9.7%
3 109965
9.7%
1 107204
9.4%
0 94336
8.3%
9 88487
7.8%
4 72059
 
6.3%

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 186529
Distinct characters: 7
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 6
2nd row: 6
3rd row: 6
4th row: 6
5th row: 6

Value | Count | Frequency (%)
6 | 177496 | 95.2%
5 | 5161 | 2.8%
4 | 2981 | 1.6%
3 | 869 | 0.5%
0 | 18 | < 0.1%
1 | 2 | < 0.1%
7 | 2 | < 0.1%

Most occurring characters

ValueCountFrequency (%)
6 177496
95.2%
5 5161
 
2.8%
4 2981
 
1.6%
3 869
 
0.5%
0 18
 
< 0.1%
1 2
 
< 0.1%
7 2
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 186529
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
6 177496
95.2%
5 5161
 
2.8%
4 2981
 
1.6%
3 869
 
0.5%
0 18
 
< 0.1%
1 2
 
< 0.1%
7 2
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 186529
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
6 177496
95.2%
5 5161
 
2.8%
4 2981
 
1.6%
3 869
 
0.5%
0 18
 
< 0.1%
1 2
 
< 0.1%
7 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 186529
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
6 177496
95.2%
5 5161
 
2.8%
4 2981
 
1.6%
3 869
 
0.5%
0 18
 
< 0.1%
1 2
 
< 0.1%
7 2
 
< 0.1%

phylumKey
Text

Missing 

Distinct: 17
Distinct (%): < 0.1%
Missing: 28431
Missing (%): 15.2%
Memory size: 1.4 MiB

Length

Max length: 8
Median length: 7
Mean length: 5.208237928
Min length: 1

Characters and Unicode

Total characters: 823412
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4
Unique (%): < 0.1%

Sample

1st row: 7707728
2nd row: 7707728
3rd row: 7707728
4th row: 9
5th row: 7707728

Value | Count | Frequency (%)
7707728 | 104064 | 65.8%
9 | 21776 | 13.8%
35 | 14896 | 9.4%
106 | 5566 | 3.5%
95 | 5121 | 3.2%
98 | 2980 | 1.9%
36 | 1763 | 1.1%
68 | 867 | 0.5%
7819616 | 620 | 0.4%
13 | 428 | 0.3%
Other values (7) | 17 | < 0.1%

Most occurring characters

ValueCountFrequency (%)
7 416878
50.6%
0 109631
 
13.3%
8 108533
 
13.2%
2 104066
 
12.6%
9 30498
 
3.7%
5 20020
 
2.4%
3 17099
 
2.1%
6 9438
 
1.1%
1 7238
 
0.9%
4 11
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 823412
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
7 416878
50.6%
0 109631
 
13.3%
8 108533
 
13.2%
2 104066
 
12.6%
9 30498
 
3.7%
5 20020
 
2.4%
3 17099
 
2.1%
6 9438
 
1.1%
1 7238
 
0.9%
4 11
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 823412
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
7 416878
50.6%
0 109631
 
13.3%
8 108533
 
13.2%
2 104066
 
12.6%
9 30498
 
3.7%
5 20020
 
2.4%
3 17099
 
2.1%
6 9438
 
1.1%
1 7238
 
0.9%
4 11
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 823412
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
7 416878
50.6%
0 109631
 
13.3%
8 108533
 
13.2%
2 104066
 
12.6%
9 30498
 
3.7%
5 20020
 
2.4%
3 17099
 
2.1%
6 9438
 
1.1%
1 7238
 
0.9%
4 11
 
< 0.1%

classKey
Text

Missing 

Distinct: 49
Distinct (%): < 0.1%
Missing: 28457
Missing (%): 15.3%
Memory size: 1.4 MiB

Length

Max length: 8
Median length: 3
Mean length: 3.594324105
Min length: 3

Characters and Unicode

Total characters: 568162
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
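
The length and character figures are tied together by simple arithmetic: the total character count should equal the mean length multiplied by the number of non-missing values. The short check below uses the classKey figures from this report and the 186529 observations from the overview. The variable names are illustrative.

```python
# Figures copied from the report for classKey.
observations = 186529       # rows in the dataset (from the overview)
missing = 28457             # missing classKey values
mean_length = 3.594324105   # reported mean length
total_characters = 568162   # reported total characters

non_missing = observations - missing
expected_total = mean_length * non_missing

# The reported mean is rounded, so compare after rounding to whole characters.
print(non_missing, round(expected_total), total_characters)
```

If the two totals disagreed, that would suggest the length statistics and the character statistics were computed over different sets of rows.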

Unique

Unique: 6
Unique (%): < 0.1%

Sample

1st row: 196
2nd row: 220
3rd row: 196
4th row: 126
5th row: 220

Value              Count  Frequency (%)
220                47673  30.2%
196                34102  21.6%
126                19329  12.2%
7228684            18403  11.6%
327                11812  7.5%
342                5407   3.4%
180                4619   2.9%
7073593            2903   1.8%
125                2447   1.5%
190                2360   1.5%
Other values (39)  9017   5.7%

Most occurring characters

Character  Count   Frequency (%)
2          176119  31.0%
6          73133   12.9%
1          69453   12.2%
0          58587   10.3%
9          43889   7.7%
8          42426   7.5%
7          38983   6.9%
4          28699   5.1%
3          26684   4.7%
5          10189   1.8%

Most occurring categories

Category        Count   Frequency (%)
Decimal Number  568162  100.0%

Most occurring scripts

Script  Count   Frequency (%)
Common  568162  100.0%

Most occurring blocks

Block  Count   Frequency (%)
ASCII  568162  100.0%

Most frequent character per category, script, and block

All characters fall in a single category (Decimal Number), script (Common), and block (ASCII), so the per-category, per-script, and per-block tables are identical to the character table above.

orderKey
Text

Missing 

Distinct: 249
Distinct (%): 0.2%
Missing: 28496
Missing (%): 15.3%
Memory size: 1.4 MiB

Length

Max length: 8
Median length: 3
Mean length: 3.62612872
Min length: 3

Characters and Unicode

Total characters: 573048
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 23
Unique (%): < 0.1%

Sample

1st row: 1369
2nd row: 412
3rd row: 1369
4th row: 381
5th row: 408

Value               Count  Frequency (%)
1369                23133  14.6%
392                 14202  9.0%
381                 11845  7.5%
414                 7806   4.9%
1169                5708   3.6%
617                 5685   3.6%
1370                5474   3.5%
377                 5373   3.4%
408                 4708   3.0%
691                 3883   2.5%
Other values (239)  70216  44.4%
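
The report gives both a Distinct and a Unique figure for orderKey (249 and 23). The sketch below shows one way to compute the two, on the assumption that "Distinct" counts the different values present and "Unique" counts the values that occur exactly once. The helper name `distinct_and_unique` is hypothetical.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (distinct, unique) counts, ignoring missing values (None)."""
    counts = Counter(v for v in values if v is not None)
    distinct = len(counts)                              # different values seen
    unique = sum(1 for n in counts.values() if n == 1)  # values seen exactly once
    return distinct, unique

# The five orderKey sample rows from the report, plus one missing value.
keys = ["1369", "412", "1369", "381", "408", None]
print(distinct_and_unique(keys))
```

In this small sample "1369" appears twice, so it counts toward Distinct but not toward Unique.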

Most occurring characters

Character  Count   Frequency (%)
1          120264  21.0%
3          93801   16.4%
9          66033   11.5%
6          64887   11.3%
4          53476   9.3%
2          51228   8.9%
7          38365   6.7%
0          29712   5.2%
8          27679   4.8%
5          27603   4.8%

Most occurring categories

Category        Count   Frequency (%)
Decimal Number  573048  100.0%

Most occurring scripts

Script  Count   Frequency (%)
Common  573048  100.0%

Most occurring blocks

Block  Count   Frequency (%)
ASCII  573048  100.0%

Most frequent character per category, script, and block

All characters fall in a single category (Decimal Number), script (Common), and block (ASCII), so the per-category, per-script, and per-block tables are identical to the character table above.

familyKey
Text

Missing 

Distinct: 815
Distinct (%): 0.5%
Missing: 28710
Missing (%): 15.4%
Memory size: 1.4 MiB

Length

Max length: 8
Median length: 4
Mean length: 4.19253702
Min length: 4

Characters and Unicode

Total characters: 661662
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 100
Unique (%): 0.1%

Sample

1st row: 5353
2nd row: 2503
3rd row: 7708
4th row: 6134
5th row: 4194986

Value               Count   Frequency (%)
7708                13776   8.7%
3065                7289    4.6%
3073                6277    4.0%
5386                4763    3.0%
7689                3466    2.2%
2373                3452    2.2%
5015                3290    2.1%
2367                3008    1.9%
4673                2360    1.5%
5353                2219    1.4%
Other values (805)  107919  68.4%
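
The frequency table for familyKey lists the ten most common values and folds the remaining 805 distinct values into one "Other values" row. The percentages appear to be taken relative to the non-missing rows, for example 13776 / (186529 - 28710) ≈ 8.7%. The sketch below builds a table of that shape under that assumption. The function name `frequency_table` is hypothetical and this is not the profiling tool's own code.

```python
from collections import Counter

def frequency_table(values, top=10):
    """Top-N value counts plus an 'Other values' row, as % of non-missing rows."""
    present = [v for v in values if v is not None]
    counts = Counter(present)
    rows = [(value, n, 100 * n / len(present))
            for value, n in counts.most_common(top)]
    rest = len(counts) - len(rows)   # distinct values not shown individually
    if rest > 0:
        other = len(present) - sum(n for _, n, _ in rows)
        rows.append((f"Other values ({rest})", other, 100 * other / len(present)))
    return rows

# Cross-check of the first familyKey row using figures from the report.
non_missing = 186529 - 28710
print(round(100 * 13776 / non_missing, 1))
```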

Most occurring characters

Character  Count  Frequency (%)
6          97237  14.7%
7          89007  13.5%
2          83824  12.7%
3          80434  12.2%
8          66968  10.1%
0          57302  8.7%
5          50492  7.6%
4          50488  7.6%
1          47343  7.2%
9          38567  5.8%

Most occurring categories

Category        Count   Frequency (%)
Decimal Number  661662  100.0%

Most occurring scripts

Script  Count   Frequency (%)
Common  661662  100.0%

Most occurring blocks

Block  Count   Frequency (%)
ASCII  661662  100.0%

Most frequent character per category, script, and block

All characters fall in a single category (Decimal Number), script (Common), and block (ASCII), so the per-category, per-script, and per-block tables are identical to the character table above.

genusKey
Text

Missing 

Distinct: 4124
Distinct (%): 2.6%
Missing: 28788
Missing (%): 15.4%
Memory size: 1.4 MiB

Length

Max length: 8
Median length: 7
Mean length: 7.016438339
Min length: 7

Characters and Unicode

Total characters: 1106780
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1062
Unique (%): 0.7%

Sample

1st row: 2700604
2nd row: 3170037
3rd row: 2721893
4th row: 4276909
5th row: 6008574

Value                Count   Frequency (%)
2721893              8831    5.6%
2668958              2360    1.5%
2651126              2269    1.4%
2701072              1784    1.1%
2688736              1708    1.1%
2650583              1705    1.1%
2689215              1518    1.0%
3196548              1505    1.0%
2650111              1375    0.9%
2874237              1213    0.8%
Other values (4114)  133473  84.6%

Most occurring characters

Character  Count   Frequency (%)
2          187980  17.0%
8          133207  12.0%
6          127549  11.5%
3          111156  10.0%
7          108846  9.8%
1          100782  9.1%
9          99825   9.0%
5          90092   8.1%
0          89046   8.0%
4          58297   5.3%

Most occurring categories

Category        Count    Frequency (%)
Decimal Number  1106780  100.0%

Most occurring scripts

Script  Count    Frequency (%)
Common  1106780  100.0%

Most occurring blocks

Block  Count    Frequency (%)
ASCII  1106780  100.0%

Most frequent character per category, script, and block

All characters fall in a single category (Decimal Number), script (Common), and block (ASCII), so the per-category, per-script, and per-block tables are identical to the character table above.

speciesKey
Text

Missing 

Distinct: 11415
Distinct (%): 8.6%
Missing: 54335
Missing (%): 29.1%
Memory size: 1.4 MiB

Length

Max length: 8
Median length: 7
Mean length: 7.020250541
Min length: 7

Characters and Unicode

Total characters: 928035
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4601
Unique (%): 3.5%

Sample

1st row: 2700991
2nd row: 3170096
3rd row: 2728060
4th row: 4276910
5th row: 6070732

Value                 Count   Frequency (%)
2689220               495     0.4%
4276980               477     0.4%
2689218               392     0.3%
4276912               379     0.3%
2689212               359     0.3%
5710205               342     0.3%
2689327               337     0.3%
2688707               333     0.3%
2688970               318     0.2%
5286325               288     0.2%
Other values (11405)  128474  97.2%

Most occurring characters

Character  Count   Frequency (%)
2          149599  16.1%
7          99507   10.7%
8          97422   10.5%
3          95610   10.3%
5          89780   9.7%
6          89662   9.7%
1          86682   9.3%
0          79003   8.5%
9          76358   8.2%
4          64412   6.9%

Most occurring categories

Category        Count   Frequency (%)
Decimal Number  928035  100.0%

Most occurring scripts

Script  Count   Frequency (%)
Common  928035  100.0%

Most occurring blocks

Block  Count   Frequency (%)
ASCII  928035  100.0%

Most frequent character per category, script, and block

All characters fall in a single category (Decimal Number), script (Common), and block (ASCII), so the per-category, per-script, and per-block tables are identical to the character table above.

species
Text

Missing 

Distinct: 11401
Distinct (%): 8.6%
Missing: 54335
Missing (%): 29.1%
Memory size: 1.4 MiB

Length

Max length: 34
Median length: 30
Mean length: 18.93441457
Min length: 8

Characters and Unicode

Total characters: 2503016
Distinct characters: 54
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4591
Unique (%): 3.5%

Sample

1st row: Luzula bulbosa
2nd row: Gentiana clausa
3rd row: Carex vulpinoidea
4th row: Lophocolea minor
5th row: Mimulus ringens

Value                Count   Frequency (%)
carex                7436    2.8%
sphagnum             2358    0.9%
frullania            1702    0.6%
canadensis           1589    0.6%
scapania             1515    0.6%
juncus               1326    0.5%
viola                1196    0.5%
viburnum             1126    0.4%
dichanthelium        996     0.4%
cyperus              990     0.4%
Other values (9729)  244228  92.3%

Most occurring characters

Character          Count   Frequency (%)
a                  299823  12.0%
i                  240977  9.6%
e                  162367  6.5%
l                  155318  6.2%
r                  153244  6.1%
u                  145277  5.8%
o                  142900  5.7%
s                  139597  5.6%
(space)            132268  5.3%
n                  131280  5.2%
Other values (44)  799965  32.0%

Most occurring categories

Category          Count    Frequency (%)
Lowercase Letter  2237753  89.4%
Space Separator   132268   5.3%
Uppercase Letter  132194   5.3%
Dash Punctuation  801      < 0.1%

Most frequent character per category

Lowercase Letter
Character          Count   Frequency (%)
a                  299823  13.4%
i                  240977  10.8%
e                  162367  7.3%
l                  155318  6.9%
r                  153244  6.8%
u                  145277  6.5%
o                  142900  6.4%
s                  139597  6.2%
n                  131280  5.9%
t                  108642  4.9%
Other values (16)  558328  25.0%

Uppercase Letter
Character          Count  Frequency (%)
C                  21176  16.0%
S                  16784  12.7%
P                  16487  12.5%
A                  9762   7.4%
L                  7739   5.9%
D                  6545   5.0%
R                  6344   4.8%
M                  5621   4.3%
E                  5440   4.1%
B                  5064   3.8%
Other values (16)  31232  23.6%

Space Separator
Character  Count   Frequency (%)
(space)    132268  100.0%

Dash Punctuation
Character  Count  Frequency (%)
-          801    100.0%

Most occurring scripts

Script  Count    Frequency (%)
Latin   2369947  94.7%
Common  133069   5.3%

Most frequent character per script

Latin
Character          Count   Frequency (%)
a                  299823  12.7%
i                  240977  10.2%
e                  162367  6.9%
l                  155318  6.6%
r                  153244  6.5%
u                  145277  6.1%
o                  142900  6.0%
s                  139597  5.9%
n                  131280  5.5%
t                  108642  4.6%
Other values (42)  690522  29.1%

Common
Character  Count   Frequency (%)
(space)    132268  99.4%
-          801     0.6%

Most occurring blocks

Block  Count    Frequency (%)
ASCII  2503016  100.0%

Most frequent character per block

ASCII
All characters fall in this block, so the table is identical to "Most occurring characters" above.

Distinct: 13241
Distinct (%): 7.1%
Missing: 18
Missing (%): < 0.1%
Memory size: 1.4 MiB

Length

Max length: 122
Median length: 81
Mean length: 25.88063439
Min length: 5

Characters and Unicode

Total characters: 4827023
Distinct characters: 110
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5383
Unique (%): 2.9%

Sample

1st row: Luzula bulbosa (Alph.Wood) Smyth & L.C.R.Smyth
2nd row: Gentiana clausa Raf.
3rd row: Carex vulpinoidea Michx.
4th row: Lophocolea minor Nees
5th row: Plantae

Value                 Count   Frequency (%)
l                     53323   8.7%
plantae               28377   4.7%
(empty)               14536   2.4%
ex                    10569   1.7%
carex                 8831    1.4%
hedw                  6411    1.1%
willd                 5097    0.8%
dumort                4794    0.8%
michx                 4667    0.8%
subsp                 3140    0.5%
Other values (13765)  470434  77.1%

Most occurring characters

Character           Count    Frequency (%)
a                   449227   9.3%
(space)             423668   8.8%
i                   320087   6.6%
e                   303237   6.3%
l                   260082   5.4%
r                   247172   5.1%
n                   228432   4.7%
.                   219057   4.5%
o                   211700   4.4%
s                   202864   4.2%
Other values (100)  1961497  40.6%

Most occurring categories

Category               Count    Frequency (%)
Lowercase Letter       3519415  72.9%
Uppercase Letter       491763   10.2%
Space Separator        423668   8.8%
Other Punctuation      241072   5.0%
Open Punctuation       66474    1.4%
Close Punctuation      66474    1.4%
Decimal Number         15952    0.3%
Dash Punctuation       1598     < 0.1%
Math Symbol            598      < 0.1%
Connector Punctuation  9        < 0.1%

Most frequent character per category

Lowercase Letter
Character          Count   Frequency (%)
a                  449227  12.8%
i                  320087  9.1%
e                  303237  8.6%
l                  260082  7.4%
r                  247172  7.0%
n                  228432  6.5%
o                  211700  6.0%
s                  202864  5.8%
t                  199790  5.7%
u                  197691  5.6%
Other values (48)  899133  25.5%

Uppercase Letter
Character          Count   Frequency (%)
L                  78990   16.1%
P                  58272   11.8%
S                  47010   9.6%
C                  37377   7.6%
A                  31917   6.5%
M                  26946   5.5%
H                  26372   5.4%
B                  22862   4.6%
D                  22849   4.6%
R                  19557   4.0%
Other values (22)  119611  24.3%

Decimal Number
Character  Count  Frequency (%)
1          4467   28.0%
8          3584   22.5%
2          2003   12.6%
0          1837   11.5%
9          1077   6.8%
3          852    5.3%
4          733    4.6%
7          540    3.4%
5          447    2.8%
6          412    2.6%

Other Punctuation
Character  Count   Frequency (%)
.          219057  90.9%
&          14536   6.0%
,          7310    3.0%
'          169     0.1%

Space Separator
Character  Count   Frequency (%)
(space)    423668  100.0%

Open Punctuation
Character  Count  Frequency (%)
(          66474  100.0%

Close Punctuation
Character  Count  Frequency (%)
)          66474  100.0%

Dash Punctuation
Character  Count  Frequency (%)
-          1598   100.0%

Math Symbol
Character  Count  Frequency (%)
×          598    100.0%

Connector Punctuation
Character  Count  Frequency (%)
_          9      100.0%

Most occurring scripts

Script  Count    Frequency (%)
Latin   4011178  83.1%
Common  815845   16.9%

Most frequent character per script

Latin
Character          Count    Frequency (%)
a                  449227   11.2%
i                  320087   8.0%
e                  303237   7.6%
l                  260082   6.5%
r                  247172   6.2%
n                  228432   5.7%
o                  211700   5.3%
s                  202864   5.1%
t                  199790   5.0%
u                  197691   4.9%
Other values (80)  1390896  34.7%

Common
Character          Count   Frequency (%)
(space)            423668  51.9%
.                  219057  26.9%
(                  66474   8.1%
)                  66474   8.1%
&                  14536   1.8%
,                  7310    0.9%
1                  4467    0.5%
8                  3584    0.4%
2                  2003    0.2%
0                  1837    0.2%
Other values (10)  6435    0.8%

Most occurring blocks

Block  Count    Frequency (%)
ASCII  4815169  99.8%
None   11854    0.2%

Most frequent character per block

ASCII
Character          Count    Frequency (%)
a                  449227   9.3%
(space)            423668   8.8%
i                  320087   6.6%
e                  303237   6.3%
l                  260082   5.4%
r                  247172   5.1%
n                  228432   4.7%
.                  219057   4.5%
o                  211700   4.4%
s                  202864   4.2%
Other values (61)  1949643  40.5%

None
Character          Count  Frequency (%)
ü                  2478   20.9%
ö                  2362   19.9%
á                  1752   14.8%
ň                  1314   11.1%
ä                  966    8.1%
é                  722    6.1%
×                  598    5.0%
Á                  349    2.9%
ø                  272    2.3%
Å                  263    2.2%
Other values (29)  778    6.6%

Distinct: 16379
Distinct (%): 8.8%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 50
Median length: 43
Mean length: 15.95681637
Min length: 3

Characters and Unicode

Total characters: 2976409
Distinct characters: 58
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7549
Unique (%): 4.0%

Sample

1st row: Luzula bulbosa
2nd row: Gentiana clausa
3rd row: Carex muhlenbergii
4th row: Lophocolea minor
5th row: Plantae

Value                 Count   Frequency (%)
plantae               28374   8.6%
carex                 8803    2.7%
var                   3699    1.1%
dryopteris            2392    0.7%
sphagnum              2360    0.7%
juncus                1814    0.5%
frullania             1708    0.5%
asplenium             1557    0.5%
scapania              1517    0.5%
canadensis            1515    0.5%
Other values (11105)  276305  83.7%

Most occurring characters

Character          Count   Frequency (%)
a                  389969  13.1%
i                  262458  8.8%
e                  205819  6.9%
l                  196972  6.6%
r                  175234  5.9%
n                  170772  5.7%
u                  162576  5.5%
o                  156926  5.3%
s                  154857  5.2%
t                  146008  4.9%
Other values (48)  954818  32.1%

Most occurring categories

Category           Count    Frequency (%)
Lowercase Letter   2641527  88.7%
Uppercase Letter   186514   6.3%
Space Separator    143515   4.8%
Other Punctuation  4146     0.1%
Dash Punctuation   705      < 0.1%
Open Punctuation   1        < 0.1%
Close Punctuation  1        < 0.1%

Most frequent character per category

Lowercase Letter
Character          Count   Frequency (%)
a                  389969  14.8%
i                  262458  9.9%
e                  205819  7.8%
l                  196972  7.5%
r                  175234  6.6%
n                  170772  6.5%
u                  162576  6.2%
o                  156926  5.9%
s                  154857  5.9%
t                  146008  5.5%
Other values (16)  619936  23.5%

Uppercase Letter
Character          Count  Frequency (%)
P                  49351  26.5%
C                  26040  14.0%
S                  16978  9.1%
A                  13951  7.5%
L                  10862  5.8%
D                  7781   4.2%
R                  6989   3.7%
E                  6742   3.6%
B                  6396   3.4%
M                  6026   3.2%
Other values (16)  35398  19.0%

Other Punctuation
Character  Count  Frequency (%)
.          4144   > 99.9%
?          2      < 0.1%

Space Separator
Character  Count   Frequency (%)
(space)    143515  100.0%

Dash Punctuation
Character  Count  Frequency (%)
-          705    100.0%

Open Punctuation
Character  Count  Frequency (%)
(          1      100.0%

Close Punctuation
Character  Count  Frequency (%)
)          1      100.0%

Most occurring scripts

Script  Count    Frequency (%)
Latin   2828041  95.0%
Common  148368   5.0%

Most frequent character per script

Latin
Character          Count   Frequency (%)
a                  389969  13.8%
i                  262458  9.3%
e                  205819  7.3%
l                  196972  7.0%
r                  175234  6.2%
n                  170772  6.0%
u                  162576  5.7%
o                  156926  5.5%
s                  154857  5.5%
t                  146008  5.2%
Other values (42)  806450  28.5%

Common
Character  Count   Frequency (%)
(space)    143515  96.7%
.          4144    2.8%
-          705     0.5%
?          2       < 0.1%
(          1       < 0.1%
)          1       < 0.1%

Most occurring blocks

Block  Count    Frequency (%)
ASCII  2976409  100.0%

Most frequent character per block

ASCII
All characters fall in this block, so the table is identical to "Most occurring characters" above.

protocol
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 559587
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: EML
2nd row: EML
3rd row: EML
4th row: EML
5th row: EML

Value  Count   Frequency (%)
eml    186529  100.0%
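
Variables flagged Constant, such as protocol, hold one value in every row. A minimal sketch of how such columns could be detected in a table held as a plain dictionary of column lists follows. The `constant_columns` helper and the toy table are hypothetical and are not taken from the profiling tool.

```python
def constant_columns(table):
    """Return names of columns whose non-missing values are all identical."""
    result = []
    for name, values in table.items():
        present = {v for v in values if v is not None}
        if len(present) == 1:
            result.append(name)
    return result

# Toy table using values that appear in the report's samples.
toy = {
    "protocol": ["EML", "EML", "EML"],
    "repatriated": ["false", "true", None],
}
print(constant_columns(toy))
```

Whether a column with one distinct value plus missing entries should count as constant is a design choice; this sketch ignores missing entries.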

Most occurring characters

Character  Count   Frequency (%)
E          186529  33.3%
M          186529  33.3%
L          186529  33.3%

Most occurring categories

Category          Count   Frequency (%)
Uppercase Letter  559587  100.0%

Most occurring scripts

Script  Count   Frequency (%)
Latin   559587  100.0%

Most occurring blocks

Block  Count   Frequency (%)
ASCII  559587  100.0%

Most frequent character per category, script, and block

All characters fall in a single category (Uppercase Letter), script (Latin), and block (ASCII), so the per-category, per-script, and per-block tables are identical to the character table above.

Distinct: 18062
Distinct (%): 9.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 24
Median length: 24
Mean length: 23.99661179
Min length: 20

Characters and Unicode

Total characters: 4476064
Distinct characters: 15
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1510
Unique (%): 0.8%

Sample

1st row: 2025-01-07T13:09:13.317Z
2nd row: 2025-01-07T13:09:13.317Z
3rd row: 2025-01-07T13:09:13.317Z
4th row: 2025-01-07T13:09:13.317Z
5th row: 2025-01-07T13:09:13.318Z

Value                     Count   Frequency (%)
2025-01-07t13:09:09.691z  61      < 0.1%
2025-01-07t13:09:14.842z  55      < 0.1%
2025-01-07t13:09:13.153z  54      < 0.1%
2025-01-07t13:09:14.841z  53      < 0.1%
2025-01-07t13:09:14.268z  52      < 0.1%
2025-01-07t13:09:10.471z  51      < 0.1%
2025-01-07t13:09:13.231z  51      < 0.1%
2025-01-07t13:09:13.898z  51      < 0.1%
2025-01-07t13:09:12.942z  51      < 0.1%
2025-01-07t13:09:14.350z  50      < 0.1%
Other values (18052)      186000  99.7%
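
These timestamps are profiled as text. Most values are 24 characters long, but the minimum length is 20, which suggests that some values omit the fractional seconds. The sketch below parses both shapes into timezone-aware datetimes. The 20-character form shown is an assumption, since no such value appears in the sample, and `parse_utc` is a hypothetical helper name.

```python
from datetime import datetime, timezone

def parse_utc(stamp):
    """Parse 'YYYY-MM-DDTHH:MM:SS[.fff]Z' into a timezone-aware datetime."""
    # Pick the format by whether fractional seconds are present.
    fmt = "%Y-%m-%dT%H:%M:%S.%fZ" if "." in stamp else "%Y-%m-%dT%H:%M:%SZ"
    return datetime.strptime(stamp, fmt).replace(tzinfo=timezone.utc)

print(parse_utc("2025-01-07T13:09:13.317Z"))
print(parse_utc("2025-01-07T13:09:13Z"))  # assumed shape of the 20-character values
```

Parsing into datetimes rather than keeping strings makes range filters and sorting behave chronologically even when the two shapes are mixed.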

Most occurring characters

Character         Count   Frequency (%)
0                 902065  20.2%
1                 544038  12.2%
2                 450890  10.1%
-                 373058  8.3%
:                 373058  8.3%
3                 270263  6.0%
5                 267401  6.0%
7                 257877  5.8%
9                 254149  5.7%
T                 186529  4.2%
Other values (5)  596736  13.3%

Most occurring categories

Category           Count    Frequency (%)
Decimal Number     3170519  70.8%
Other Punctuation  559429   12.5%
Dash Punctuation   373058   8.3%
Uppercase Letter   373058   8.3%

Most frequent character per category

Decimal Number
Character  Count   Frequency (%)
0          902065  28.5%
1          544038  17.2%
2          450890  14.2%
3          270263  8.5%
5          267401  8.4%
7          257877  8.1%
9          254149  8.0%
4          80989   2.6%
8          73891   2.3%
6          68956   2.2%

Other Punctuation
Character  Count   Frequency (%)
:          373058  66.7%
.          186371  33.3%

Uppercase Letter
Character  Count   Frequency (%)
T          186529  50.0%
Z          186529  50.0%

Dash Punctuation
Character  Count   Frequency (%)
-          373058  100.0%

Most occurring scripts

Script  Count    Frequency (%)
Common  4103006  91.7%
Latin   373058   8.3%

Most frequent character per script

Common
Character         Count   Frequency (%)
0                 902065  22.0%
1                 544038  13.3%
2                 450890  11.0%
-                 373058  9.1%
:                 373058  9.1%
3                 270263  6.6%
5                 267401  6.5%
7                 257877  6.3%
9                 254149  6.2%
.                 186371  4.5%
Other values (3)  223836  5.5%

Latin
Character  Count   Frequency (%)
T          186529  50.0%
Z          186529  50.0%

Most occurring blocks

Block  Count    Frequency (%)
ASCII  4476064  100.0%

Most frequent character per block

ASCII
All characters fall in this block, so the table is identical to "Most occurring characters" above.

lastCrawled
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 24
Median length: 24
Mean length: 24
Min length: 24

Characters and Unicode

Total characters: 4476696
Distinct characters: 14
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2025-01-07T13:01:58.967Z
2nd row: 2025-01-07T13:01:58.967Z
3rd row: 2025-01-07T13:01:58.967Z
4th row: 2025-01-07T13:01:58.967Z
5th row: 2025-01-07T13:01:58.967Z

Value                     Count   Frequency (%)
2025-01-07t13:01:58.967z  186529  100.0%

Most occurring characters

Character         Count   Frequency (%)
0                 746116  16.7%
1                 559587  12.5%
2                 373058  8.3%
5                 373058  8.3%
-                 373058  8.3%
7                 373058  8.3%
:                 373058  8.3%
T                 186529  4.2%
3                 186529  4.2%
8                 186529  4.2%
Other values (4)  746116  16.7%

Most occurring categories

Category           Count    Frequency (%)
Decimal Number     3170993  70.8%
Other Punctuation  559587   12.5%
Dash Punctuation   373058   8.3%
Uppercase Letter   373058   8.3%

Most frequent character per category

Decimal Number
Character  Count   Frequency (%)
0          746116  23.5%
1          559587  17.6%
2          373058  11.8%
5          373058  11.8%
7          373058  11.8%
3          186529  5.9%
8          186529  5.9%
9          186529  5.9%
6          186529  5.9%

Other Punctuation
Character  Count   Frequency (%)
:          373058  66.7%
.          186529  33.3%

Uppercase Letter
Character  Count   Frequency (%)
T          186529  50.0%
Z          186529  50.0%

Dash Punctuation
Character  Count   Frequency (%)
-          373058  100.0%

Most occurring scripts

Script  Count    Frequency (%)
Common  4103638  91.7%
Latin   373058   8.3%

Most frequent character per script

Common
Character         Count   Frequency (%)
0                 746116  18.2%
1                 559587  13.6%
2                 373058  9.1%
5                 373058  9.1%
-                 373058  9.1%
7                 373058  9.1%
:                 373058  9.1%
3                 186529  4.5%
8                 186529  4.5%
.                 186529  4.5%
Other values (2)  373058  9.1%

Latin
Character  Count   Frequency (%)
T          186529  50.0%
Z          186529  50.0%

Most occurring blocks

Block  Count    Frequency (%)
ASCII  4476696  100.0%

Most frequent character per block

ASCII
All characters fall in this block, so the table is identical to "Most occurring characters" above.

repatriated
Text

Missing 

Distinct: 2
Distinct (%): < 0.1%
Missing: 72482
Missing (%): 38.9%
Memory size: 1.4 MiB

Length

Max length: 5
Median length: 5
Mean length: 4.860750392
Min length: 4

Characters and Unicode

Total characters: 554354
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: false
2nd row: false
3rd row: true
4th row: false
5th row: false

Value  Count  Frequency (%)
false  98166  86.1%
true   15881  13.9%
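
The repatriated flag is stored as the text values "true" and "false", with 72482 rows missing. A small sketch of converting such text to real booleans while keeping missing values distinct is shown below, together with a cross-check of the reported counts. `parse_flag` is a hypothetical helper name.

```python
def parse_flag(text):
    """Map 'true'/'false' text to bool; a missing value stays None."""
    if text is None:
        return None
    lowered = text.strip().lower()
    if lowered == "true":
        return True
    if lowered == "false":
        return False
    raise ValueError(f"unexpected flag value: {text!r}")

# Cross-check with the report: true + false should equal the non-missing count.
false_count, true_count, missing = 98166, 15881, 72482
print(false_count + true_count, 186529 - missing)
```

Raising on anything other than the two expected strings is deliberate here, so that an unexpected third value is noticed rather than silently treated as false.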

Most occurring characters

Character  Count   Frequency (%)
e          114047  20.6%
f          98166   17.7%
a          98166   17.7%
l          98166   17.7%
s          98166   17.7%
t          15881   2.9%
r          15881   2.9%
u          15881   2.9%

Most occurring categories

Category          Count   Frequency (%)
Lowercase Letter  554354  100.0%

Most occurring scripts

Script  Count   Frequency (%)
Latin   554354  100.0%

Most occurring blocks

Block  Count   Frequency (%)
ASCII  554354  100.0%

Most frequent character per category, script, and block

All characters fall in a single category (Lowercase Letter), script (Latin), and block (ASCII), so the per-category, per-script, and per-block tables are identical to the character table above.

isSequenced
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 932645
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: false
2nd row: false
3rd row: false
4th row: false
5th row: false

Value  Count   Frequency (%)
false  186529  100.0%

Most occurring characters

Character  Count   Frequency (%)
f          186529  20.0%
a          186529  20.0%
l          186529  20.0%
s          186529  20.0%
e          186529  20.0%

Most occurring categories

Category          Count   Frequency (%)
Lowercase Letter  932645  100.0%

Most occurring scripts

Script  Count   Frequency (%)
Latin   932645  100.0%

Most occurring blocks

Block  Count   Frequency (%)
ASCII  932645  100.0%

Most frequent character per category, script, and block

All characters fall in a single category (Lowercase Letter), script (Latin), and block (ASCII), so the per-category, per-script, and per-block tables are identical to the character table above.

gbifRegion
Text

Missing 

Distinct: 7
Distinct (%): < 0.1%
Missing: 72484
Missing (%): 38.9%
Memory size: 1.4 MiB

Length

Max length: 13
Median length: 13
Mean length: 12.75279933
Min length: 4

Characters and Unicode

Total characters: 1454393
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NORTH_AMERICA
2nd row: NORTH_AMERICA
3rd row: NORTH_AMERICA
4th row: NORTH_AMERICA
5th row: NORTH_AMERICA

Value          Count   Frequency (%)
north_america  104594  91.7%
latin_america  5603    4.9%
europe         1857    1.6%
asia           998     0.9%
oceania        680     0.6%
africa         298     0.3%
antarctica     15      < 0.1%
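
The character statistics for gbifRegion can be reproduced from the value counts alone: each region name contributes its length times its row count to the total, and only two names contain an underscore. The check below uses the counts from the table above. The dictionary name `region_counts` is illustrative.

```python
# Row counts per region, copied from the frequency table in the report.
region_counts = {
    "NORTH_AMERICA": 104594,
    "LATIN_AMERICA": 5603,
    "EUROPE": 1857,
    "ASIA": 998,
    "OCEANIA": 680,
    "AFRICA": 298,
    "ANTARCTICA": 15,
}

rows = sum(region_counts.values())
total_characters = sum(len(name) * n for name, n in region_counts.items())
underscores = sum(name.count("_") * n for name, n in region_counts.items())

# Compare with: 186529 - 72484 non-missing rows, "Total characters",
# and the Connector Punctuation count reported for this variable.
print(rows, total_characters, underscores)
```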

Most occurring characters

Character         Count   Frequency (%)
A                 229994  15.8%
R                 216961  14.9%
I                 117791  8.1%
E                 114591  7.9%
C                 111205  7.6%
N                 110892  7.6%
T                 110227  7.6%
_                 110197  7.6%
M                 110197  7.6%
O                 107131  7.4%
Other values (6)  115207  7.9%

Most occurring categories

Category               Count    Frequency (%)
Uppercase Letter       1344196  92.4%
Connector Punctuation  110197   7.6%

Most frequent character per category

Uppercase Letter
Character         Count   Frequency (%)
A                 229994  17.1%
R                 216961  16.1%
I                 117791  8.8%
E                 114591  8.5%
C                 111205  8.3%
N                 110892  8.2%
T                 110227  8.2%
M                 110197  8.2%
O                 107131  8.0%
H                 104594  7.8%
Other values (5)  10613   0.8%

Connector Punctuation
Character  Count   Frequency (%)
_          110197  100.0%

Most occurring scripts

Script  Count    Frequency (%)
Latin   1344196  92.4%
Common  110197   7.6%

Most frequent character per script

Latin
Same rows as the Uppercase Letter table above.

Common
Character  Count   Frequency (%)
_          110197  100.0%

Most occurring blocks

Block  Count    Frequency (%)
ASCII  1454393  100.0%

Most frequent character per block

ASCII
All characters fall in this block, so the table is identical to "Most occurring characters" above.

publishedByGbifRegion
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 13
Median length: 13
Mean length: 13
Min length: 13

Characters and Unicode

Total characters: 2424877
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NORTH_AMERICA
2nd row: NORTH_AMERICA
3rd row: NORTH_AMERICA
4th row: NORTH_AMERICA
5th row: NORTH_AMERICA

Value              Count    Frequency (%)
north_america      186529   100.0%

Most occurring characters

ValueCountFrequency (%)
R 373058
15.4%
A 373058
15.4%
N 186529
7.7%
O 186529
7.7%
T 186529
7.7%
H 186529
7.7%
_ 186529
7.7%
M 186529
7.7%
E 186529
7.7%
I 186529
7.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2238348
92.3%
Connector Punctuation 186529
 
7.7%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
R 373058
16.7%
A 373058
16.7%
N 186529
8.3%
O 186529
8.3%
T 186529
8.3%
H 186529
8.3%
M 186529
8.3%
E 186529
8.3%
I 186529
8.3%
C 186529
8.3%
Connector Punctuation
ValueCountFrequency (%)
_ 186529
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2238348
92.3%
Common 186529
 
7.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
R 373058
16.7%
A 373058
16.7%
N 186529
8.3%
O 186529
8.3%
T 186529
8.3%
H 186529
8.3%
M 186529
8.3%
E 186529
8.3%
I 186529
8.3%
C 186529
8.3%
Common
ValueCountFrequency (%)
_ 186529
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2424877
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
R 373058
15.4%
A 373058
15.4%
N 186529
7.7%
O 186529
7.7%
T 186529
7.7%
H 186529
7.7%
_ 186529
7.7%
M 186529
7.7%
E 186529
7.7%
I 186529
7.7%

level0Gid
Text

Missing 

Distinct: 77
Distinct (%): 0.1%
Missing: 86228
Missing (%): 46.2%
Memory size: 1.4 MiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 300903
Distinct characters: 26
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
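
The summary figures for level0Gid are mutually consistent, which can be checked with a few lines of arithmetic. This is a sketch using only numbers stated in the report (186529 observations, 86228 missing, every code 3 characters long); the variable names are illustrative.

```python
n_observations = 186529  # "Number of observations" in the report overview
n_missing = 86228        # "Missing" for level0Gid
code_length = 3          # min, median, mean and max length are all 3

n_present = n_observations - n_missing
missing_pct = round(100 * n_missing / n_observations, 1)
total_chars = n_present * code_length

print(n_present, missing_pct, total_chars)
```

The result matches the reported missing share of 46.2% and the reported total of 300903 characters.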

Unique

Unique: 10
Unique (%): < 0.1%

Sample

1st row: USA
2nd row: USA
3rd row: CAN
4th row: USA
5th row: USA

Value              Count    Frequency (%)
usa                89645    89.4%
can                5409     5.4%
mex                903      0.9%
pri                834      0.8%
chn                726      0.7%
gbr                460      0.5%
bmu                334      0.3%
fra                241      0.2%
ecu                222      0.2%
bhs                181      0.2%
Other values (67)  1346     1.3%

Most occurring characters

ValueCountFrequency (%)
A 95752
31.8%
U 90542
30.1%
S 90036
29.9%
C 6542
 
2.2%
N 6422
 
2.1%
R 1743
 
0.6%
M 1438
 
0.5%
E 1293
 
0.4%
B 1135
 
0.4%
P 1073
 
0.4%
Other values (16) 4927
 
1.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 300903
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 95752
31.8%
U 90542
30.1%
S 90036
29.9%
C 6542
 
2.2%
N 6422
 
2.1%
R 1743
 
0.6%
M 1438
 
0.5%
E 1293
 
0.4%
B 1135
 
0.4%
P 1073
 
0.4%
Other values (16) 4927
 
1.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 300903
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 95752
31.8%
U 90542
30.1%
S 90036
29.9%
C 6542
 
2.2%
N 6422
 
2.1%
R 1743
 
0.6%
M 1438
 
0.5%
E 1293
 
0.4%
B 1135
 
0.4%
P 1073
 
0.4%
Other values (16) 4927
 
1.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 300903
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 95752
31.8%
U 90542
30.1%
S 90036
29.9%
C 6542
 
2.2%
N 6422
 
2.1%
R 1743
 
0.6%
M 1438
 
0.5%
E 1293
 
0.4%
B 1135
 
0.4%
P 1073
 
0.4%
Other values (16) 4927
 
1.6%

level0Name
Text

Missing 

Distinct: 77
Distinct (%): 0.1%
Missing: 86228
Missing (%): 46.2%
Memory size: 1.4 MiB

Length

Max length: 27
Median length: 13
Mean length: 12.36356567
Min length: 4

Characters and Unicode

Total characters: 1240078
Distinct characters: 52
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 10
Unique (%): < 0.1%

Sample

1st row: United States
2nd row: United States
3rd row: Canada
4th row: United States
5th row: United States

Value              Count    Frequency (%)
united             90105    47.1%
states             89645    46.8%
canada             5409     2.8%
méxico             903      0.5%
puerto             834      0.4%
rico               834      0.4%
china              726      0.4%
kingdom            460      0.2%
bermuda            334      0.2%
france             241      0.1%
Other values (87)  1995     1.0%
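
The table above counts words rather than whole values: "united" (90105) equals "states" (89645) plus "kingdom" (460). A minimal sketch of such a count follows, assuming the report lowercases each value and splits it on whitespace; that is inferred from the figures, not confirmed by the source. The helper `word_frequencies` and the input list are hypothetical; the list reuses the five sample rows plus one missing value and one "United Kingdom" entry for illustration.

```python
from collections import Counter


def word_frequencies(values):
    """Count lowercase, whitespace-separated words, skipping missing values."""
    counts = Counter()
    for value in values:
        if value is None:
            continue
        counts.update(value.lower().split())
    return counts


# Illustrative input only, not the real column.
sample = [
    "United States",
    "United States",
    "Canada",
    "United States",
    "United States",
    None,
    "United Kingdom",
]
freq = word_frequencies(sample)
print(freq.most_common())
```

Counting words instead of values explains why the table lists more entries (10 named plus 87 others) than the 77 distinct country names.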

Most occurring characters

ValueCountFrequency (%)
t 270627
21.8%
e 181844
14.7%
a 109780
8.9%
n 97629
 
7.9%
d 96873
 
7.8%
i 93968
 
7.6%
(space) 91185
 
7.4%
s 90321
 
7.3%
U 90116
 
7.3%
S 89755
 
7.2%
Other values (42) 27980
 
2.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 957389
77.2%
Uppercase Letter 191471
 
15.4%
Space Separator 91185
 
7.4%
Other Punctuation 33
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 270627
28.3%
e 181844
19.0%
a 109780
11.5%
n 97629
 
10.2%
d 96873
 
10.1%
i 93968
 
9.8%
s 90321
 
9.4%
o 3638
 
0.4%
c 2406
 
0.3%
r 2348
 
0.2%
Other values (17) 7955
 
0.8%
Uppercase Letter
ValueCountFrequency (%)
U 90116
47.1%
S 89755
46.9%
C 6290
 
3.3%
M 992
 
0.5%
P 970
 
0.5%
R 861
 
0.4%
B 595
 
0.3%
K 475
 
0.2%
F 322
 
0.2%
A 263
 
0.1%
Other values (12) 832
 
0.4%
Other Punctuation
ValueCountFrequency (%)
. 22
66.7%
, 11
33.3%
Space Separator
ValueCountFrequency (%)
(space) 91185
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1148860
92.6%
Common 91218
 
7.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 270627
23.6%
e 181844
15.8%
a 109780
9.6%
n 97629
 
8.5%
d 96873
 
8.4%
i 93968
 
8.2%
s 90321
 
7.9%
U 90116
 
7.8%
S 89755
 
7.8%
C 6290
 
0.5%
Other values (39) 21657
 
1.9%
Common
ValueCountFrequency (%)
(space) 91185
> 99.9%
. 22
 
< 0.1%
, 11
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1239169
99.9%
None 909
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 270627
21.8%
e 181844
14.7%
a 109780
8.9%
n 97629
 
7.9%
d 96873
 
7.8%
i 93968
 
7.6%
(space) 91185
 
7.4%
s 90321
 
7.3%
U 90116
 
7.3%
S 89755
 
7.2%
Other values (40) 27071
 
2.2%
None
ValueCountFrequency (%)
é 903
99.3%
Å 6
 
0.7%
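
The rows that the report files under the block "None" are non-ASCII characters such as the "é" in "México". A small sketch for locating such characters in a value, using the standard `unicodedata` module; the helper name `non_ascii_characters` is hypothetical and this is not how the report itself assigns blocks.

```python
import unicodedata


def non_ascii_characters(text):
    """Return (character, Unicode name) pairs for code points above U+007F."""
    return [
        (ch, unicodedata.name(ch, "UNKNOWN"))
        for ch in text
        if ord(ch) > 0x7F
    ]


# "M\u00e9xico" is "México" written with an explicit escape for é (U+00E9).
found = non_ascii_characters("M\u00e9xico")
print(found)
```

By code point, U+00E9 lies in the Latin-1 Supplement range (U+0080 to U+00FF), even though the report prints the block name as "None".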

level1Gid
Text

Missing 

Distinct: 382
Distinct (%): 0.4%
Missing: 86228
Missing (%): 46.2%
Memory size: 1.4 MiB

Length

Max length: 8
Median length: 7
Mean length: 7.281044057
Min length: 7

Characters and Unicode

Total characters: 730296
Distinct characters: 38
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 71
Unique (%): 0.1%

Sample

1st row: USA.7_1
2nd row: USA.7_1
3rd row: CAN.2_1
4th row: USA.7_1
5th row: USA.7_1

Value               Count    Frequency (%)
usa.7_1             61491    61.3%
usa.23_1            2868     2.9%
usa.30_1            2549     2.5%
usa.5_1             2515     2.5%
usa.10_1            2189     2.2%
usa.22_1            1940     1.9%
usa.20_1            1921     1.9%
can.2_1             1690     1.7%
usa.33_1            1296     1.3%
usa.48_1            1264     1.3%
Other values (372)  20578    20.5%

Most occurring characters

ValueCountFrequency (%)
1 111747
15.3%
. 100301
13.7%
_ 100293
13.7%
A 95752
13.1%
U 90542
12.4%
S 90036
12.3%
7 63831
8.7%
2 14786
 
2.0%
3 12782
 
1.8%
0 7852
 
1.1%
Other values (28) 42374
 
5.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 300927
41.2%
Decimal Number 228775
31.3%
Other Punctuation 100301
 
13.7%
Connector Punctuation 100293
 
13.7%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 95752
31.8%
U 90542
30.1%
S 90036
29.9%
C 6542
 
2.2%
N 6422
 
2.1%
R 1743
 
0.6%
M 1438
 
0.5%
E 1293
 
0.4%
B 1135
 
0.4%
P 1073
 
0.4%
Other values (16) 4951
 
1.6%
Decimal Number
ValueCountFrequency (%)
1 111747
48.8%
7 63831
27.9%
2 14786
 
6.5%
3 12782
 
5.6%
0 7852
 
3.4%
4 6773
 
3.0%
5 3908
 
1.7%
6 2536
 
1.1%
8 2307
 
1.0%
9 2253
 
1.0%
Other Punctuation
ValueCountFrequency (%)
. 100301
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 100293
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 429369
58.8%
Latin 300927
41.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 95752
31.8%
U 90542
30.1%
S 90036
29.9%
C 6542
 
2.2%
N 6422
 
2.1%
R 1743
 
0.6%
M 1438
 
0.5%
E 1293
 
0.4%
B 1135
 
0.4%
P 1073
 
0.4%
Other values (16) 4951
 
1.6%
Common
ValueCountFrequency (%)
1 111747
26.0%
. 100301
23.4%
_ 100293
23.4%
7 63831
14.9%
2 14786
 
3.4%
3 12782
 
3.0%
0 7852
 
1.8%
4 6773
 
1.6%
5 3908
 
0.9%
6 2536
 
0.6%
Other values (2) 4560
 
1.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 730296
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 111747
15.3%
. 100301
13.7%
_ 100293
13.7%
A 95752
13.1%
U 90542
12.4%
S 90036
12.3%
7 63831
8.7%
2 14786
 
2.0%
3 12782
 
1.8%
0 7852
 
1.1%
Other values (28) 42374
 
5.8%

level1Name
Text

Missing 

Distinct: 380
Distinct (%): 0.4%
Missing: 86228
Missing (%): 46.2%
Memory size: 1.4 MiB

Length

Max length: 32
Median length: 11
Mean length: 10.37484173
Min length: 3

Characters and Unicode

Total characters: 1040607
Distinct characters: 74
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 71
Unique (%): 0.1%

Sample

1st row: Connecticut
2nd row: Connecticut
3rd row: British Columbia
4th row: Connecticut
5th row: Connecticut

Value               Count    Frequency (%)
connecticut         61491    54.9%
new                 4618     4.1%
michigan            2868     2.6%
hampshire           2549     2.3%
california          2536     2.3%
florida             2189     2.0%
massachusetts       1940     1.7%
maine               1921     1.7%
columbia            1815     1.6%
british             1690     1.5%
Other values (428)  28413    25.4%

Most occurring characters

ValueCountFrequency (%)
n 148113
14.2%
t 137257
13.2%
c 132193
12.7%
i 98938
9.5%
o 86362
8.3%
e 83832
8.1%
u 69788
6.7%
C 67628
6.5%
a 45481
 
4.4%
r 20585
 
2.0%
Other values (64) 150430
14.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 916653
88.1%
Uppercase Letter 111765
 
10.7%
Space Separator 11729
 
1.1%
Dash Punctuation 329
 
< 0.1%
Other Punctuation 131
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 148113
16.2%
t 137257
15.0%
c 132193
14.4%
i 98938
10.8%
o 86362
9.4%
e 83832
9.1%
u 69788
7.6%
a 45481
 
5.0%
r 20585
 
2.2%
s 20401
 
2.2%
Other values (33) 73703
8.0%
Uppercase Letter
ValueCountFrequency (%)
C 67628
60.5%
M 8798
 
7.9%
N 7769
 
7.0%
H 4127
 
3.7%
F 2321
 
2.1%
S 2246
 
2.0%
B 2016
 
1.8%
W 2003
 
1.8%
A 1861
 
1.7%
V 1821
 
1.6%
Other values (17) 11175
 
10.0%
Other Punctuation
ValueCountFrequency (%)
' 127
96.9%
. 4
 
3.1%
Space Separator
ValueCountFrequency (%)
(space) 11729
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 329
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1028418
98.8%
Common 12189
 
1.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 148113
14.4%
t 137257
13.3%
c 132193
12.9%
i 98938
9.6%
o 86362
8.4%
e 83832
8.2%
u 69788
6.8%
C 67628
6.6%
a 45481
 
4.4%
r 20585
 
2.0%
Other values (60) 138241
13.4%
Common
ValueCountFrequency (%)
(space) 11729
96.2%
- 329
 
2.7%
' 127
 
1.0%
. 4
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1038649
99.8%
None 1957
 
0.2%
Latin Ext Additional 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 148113
14.3%
t 137257
13.2%
c 132193
12.7%
i 98938
9.5%
o 86362
8.3%
e 83832
8.1%
u 69788
6.7%
C 67628
6.5%
a 45481
 
4.4%
r 20585
 
2.0%
Other values (46) 148472
14.3%
None
ValueCountFrequency (%)
é 1146
58.6%
í 368
 
18.8%
á 92
 
4.7%
ô 78
 
4.0%
ö 76
 
3.9%
ó 62
 
3.2%
ü 48
 
2.5%
ã 24
 
1.2%
à 16
 
0.8%
ś 13
 
0.7%
Other values (7) 34
 
1.7%
Latin Ext Additional
ValueCountFrequency (%)
1
100.0%

level2Gid
Text

Missing 

Distinct: 1864
Distinct (%): 1.9%
Missing: 87766
Missing (%): 47.1%
Memory size: 1.4 MiB

Length

Max length: 12
Median length: 9
Mean length: 9.519212661
Min length: 7

Characters and Unicode

Total characters: 940146
Distinct characters: 38
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 550
Unique (%): 0.6%

Sample

1st row: USA.7.6_1
2nd row: USA.7.5_1
3rd row: CAN.2.8_1
4th row: USA.7.3_1
5th row: USA.7.2_1

Value                Count    Frequency (%)
usa.7.5_1            21687    22.0%
usa.7.2_1            10598    10.7%
usa.7.3_1            8951     9.1%
usa.7.1_1            6380     6.5%
usa.7.6_1            6092     6.2%
usa.7.4_1            4045     4.1%
usa.7.7_1            1936     2.0%
usa.7.8_1            1802     1.8%
usa.30.4_1           1119     1.1%
can.7.17_1           1051     1.1%
Other values (1854)  35102    35.5%
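
The identifiers in these columns appear to be hierarchical: a country code, one dot-separated index per administrative level, and a suffix after the underscore (for example USA.7.5_1 sits under USA.7_1, which sits under USA). The sketch below parses that shape. The format is inferred from the sample values only, the helper names `parse_gid` and `parent_gid` are hypothetical, and the assumption that a parent keeps the same suffix is not confirmed by the source.

```python
def parse_gid(gid):
    """Split an identifier such as 'USA.7.5_1' into its parts."""
    path, _, suffix = gid.partition("_")
    country, *indices = path.split(".")
    return country, [int(i) for i in indices], suffix


def parent_gid(gid):
    """Return the identifier one level up, or None for a bare country code."""
    path, _, suffix = gid.partition("_")
    parts = path.split(".")
    if len(parts) == 1:
        return None  # already at country level
    parent = ".".join(parts[:-1])
    # Country-level codes in the sample (e.g. 'USA') carry no suffix.
    return parent if len(parts) == 2 else f"{parent}_{suffix}"


print(parse_gid("USA.7.5_1"))
print(parent_gid("USA.7.5_1"))
print(parent_gid("USA.7_1"))
```

Walking up with `parent_gid` is one way to cross-check that each level2Gid is filed under the level1Gid and level0Gid recorded on the same row.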

Most occurring characters

ValueCountFrequency (%)
. 197518
21.0%
1 127750
13.6%
_ 98763
10.5%
A 95669
10.2%
U 90208
9.6%
S 89851
9.6%
7 70699
 
7.5%
2 33642
 
3.6%
5 32966
 
3.5%
3 28520
 
3.0%
Other values (28) 74560
 
7.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 347576
37.0%
Uppercase Letter 296289
31.5%
Other Punctuation 197518
21.0%
Connector Punctuation 98763
 
10.5%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 95669
32.3%
U 90208
30.4%
S 89851
30.3%
C 6509
 
2.2%
N 6400
 
2.2%
E 1270
 
0.4%
M 1029
 
0.3%
X 903
 
0.3%
R 855
 
0.3%
H 788
 
0.3%
Other values (16) 2807
 
0.9%
Decimal Number
ValueCountFrequency (%)
1 127750
36.8%
7 70699
20.3%
2 33642
 
9.7%
5 32966
 
9.5%
3 28520
 
8.2%
4 17906
 
5.2%
6 13016
 
3.7%
0 10163
 
2.9%
8 7593
 
2.2%
9 5321
 
1.5%
Other Punctuation
ValueCountFrequency (%)
. 197518
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 98763
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 643857
68.5%
Latin 296289
31.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 95669
32.3%
U 90208
30.4%
S 89851
30.3%
C 6509
 
2.2%
N 6400
 
2.2%
E 1270
 
0.4%
M 1029
 
0.3%
X 903
 
0.3%
R 855
 
0.3%
H 788
 
0.3%
Other values (16) 2807
 
0.9%
Common
ValueCountFrequency (%)
. 197518
30.7%
1 127750
19.8%
_ 98763
15.3%
7 70699
 
11.0%
2 33642
 
5.2%
5 32966
 
5.1%
3 28520
 
4.4%
4 17906
 
2.8%
6 13016
 
2.0%
0 10163
 
1.6%
Other values (2) 12914
 
2.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 940146
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 197518
21.0%
1 127750
13.6%
_ 98763
10.5%
A 95669
10.2%
U 90208
9.6%
S 89851
9.6%
7 70699
 
7.5%
2 33642
 
3.6%
5 32966
 
3.5%
3 28520
 
3.0%
Other values (28) 74560
 
7.9%

level2Name
Text

Missing 

Distinct: 1522
Distinct (%): 1.5%
Missing: 87766
Missing (%): 47.1%
Memory size: 1.4 MiB

Length

Max length: 32
Median length: 30
Mean length: 8.839798305
Min length: 3

Characters and Unicode

Total characters: 873045
Distinct characters: 96
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 397
Unique (%): 0.4%

Sample

1st row: New London
2nd row: New Haven
3rd row: Columbia-Shuswap
4th row: Litchfield
5th row: Hartford

Value                Count    Frequency (%)
new                  27815    20.6%
haven                21687    16.1%
hartford             10598    7.9%
litchfield           8951     6.6%
fairfield            6391     4.7%
london               6094     4.5%
middlesex            4507     3.3%
windham              2098     1.6%
tolland              1936     1.4%
coos                 1128     0.8%
Other values (1636)  43795    32.4%

Most occurring characters

ValueCountFrequency (%)
e 97796
 
11.2%
a 78125
 
8.9%
n 61952
 
7.1%
i 58663
 
6.7%
o 50830
 
5.8%
d 50373
 
5.8%
r 46106
 
5.3%
l 36292
 
4.2%
(space) 36237
 
4.2%
t 35119
 
4.0%
Other values (86) 321552
36.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 695469
79.7%
Uppercase Letter 136523
 
15.6%
Space Separator 36237
 
4.2%
Dash Punctuation 3055
 
0.3%
Decimal Number 962
 
0.1%
Other Punctuation 761
 
0.1%
Close Punctuation 19
 
< 0.1%
Open Punctuation 19
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 97796
14.1%
a 78125
11.2%
n 61952
8.9%
i 58663
 
8.4%
o 50830
 
7.3%
d 50373
 
7.2%
r 46106
 
6.6%
l 36292
 
5.2%
t 35119
 
5.0%
w 29400
 
4.2%
Other values (39) 150813
21.7%
Uppercase Letter
ValueCountFrequency (%)
H 34544
25.3%
N 29570
21.7%
L 17533
12.8%
M 8839
 
6.5%
F 7664
 
5.6%
C 6778
 
5.0%
S 5466
 
4.0%
W 3577
 
2.6%
T 3191
 
2.3%
B 2699
 
2.0%
Other values (20) 16662
12.2%
Decimal Number
ValueCountFrequency (%)
1 492
51.1%
0 145
 
15.1%
5 105
 
10.9%
7 67
 
7.0%
6 66
 
6.9%
3 26
 
2.7%
9 24
 
2.5%
8 18
 
1.9%
2 13
 
1.4%
4 6
 
0.6%
Other Punctuation
ValueCountFrequency (%)
. 612
80.4%
' 148
 
19.4%
/ 1
 
0.1%
Space Separator
ValueCountFrequency (%)
(space) 36237
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3055
100.0%
Close Punctuation
ValueCountFrequency (%)
) 19
100.0%
Open Punctuation
ValueCountFrequency (%)
( 19
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 831992
95.3%
Common 41053
 
4.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 97796
 
11.8%
a 78125
 
9.4%
n 61952
 
7.4%
i 58663
 
7.1%
o 50830
 
6.1%
d 50373
 
6.1%
r 46106
 
5.5%
l 36292
 
4.4%
t 35119
 
4.2%
H 34544
 
4.2%
Other values (69) 282192
33.9%
Common
ValueCountFrequency (%)
(space) 36237
88.3%
- 3055
 
7.4%
. 612
 
1.5%
1 492
 
1.2%
' 148
 
0.4%
0 145
 
0.4%
5 105
 
0.3%
7 67
 
0.2%
6 66
 
0.2%
3 26
 
0.1%
Other values (7) 100
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 871868
99.9%
None 1156
 
0.1%
Latin Ext Additional 21
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 97796
 
11.2%
a 78125
 
9.0%
n 61952
 
7.1%
i 58663
 
6.7%
o 50830
 
5.8%
d 50373
 
5.8%
r 46106
 
5.3%
l 36292
 
4.2%
(space) 36237
 
4.2%
t 35119
 
4.0%
Other values (59) 320375
36.7%
None
ValueCountFrequency (%)
é 551
47.7%
ô 244
21.1%
á 150
 
13.0%
í 64
 
5.5%
ó 39
 
3.4%
ñ 30
 
2.6%
ú 14
 
1.2%
ł 13
 
1.1%
Đ 7
 
0.6%
ö 7
 
0.6%
Other values (13) 37
 
3.2%
Latin Ext Additional
ValueCountFrequency (%)
7
33.3%
7
33.3%
5
23.8%
2
 
9.5%

level3Gid
Text

Missing 

Distinct: 728
Distinct (%): 9.5%
Missing: 178900
Missing (%): 95.9%
Memory size: 1.4 MiB

Length

Max length: 14
Median length: 13
Mean length: 12.10446979
Min length: 11

Characters and Unicode

Total characters: 92345
Distinct characters: 34
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 262
Unique (%): 3.4%

Sample

1st row: CAN.2.8.6_1
2nd row: GBR.3.1.1_1
3rd row: CAN.7.18.4_1
4th row: CAN.11.87.11_1
5th row: FRA.13.2.1_1

Value               Count    Frequency (%)
can.7.17.2_1        1048     13.7%
can.2.3.5_1         207      2.7%
can.9.35.1_1        173      2.3%
chn.14.9.8_1        163      2.1%
can.11.88.5_1       163      2.1%
gbr.1.20.1_1        153      2.0%
can.2.9.12_1        152      2.0%
can.2.8.2_1         136      1.8%
can.11.58.5_1       135      1.8%
chn.13.8.1_1        130      1.7%
Other values (718)  5169     67.8%

Most occurring characters

ValueCountFrequency (%)
. 22887
24.8%
1 17043
18.5%
_ 7629
 
8.3%
C 6398
 
6.9%
N 6301
 
6.8%
A 5812
 
6.3%
2 5632
 
6.1%
7 3254
 
3.5%
3 3181
 
3.4%
8 2209
 
2.4%
Other values (24) 11999
13.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 38942
42.2%
Other Punctuation 22887
24.8%
Uppercase Letter 22887
24.8%
Connector Punctuation 7629
 
8.3%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
C 6398
28.0%
N 6301
27.5%
A 5812
25.4%
R 784
 
3.4%
H 780
 
3.4%
B 491
 
2.1%
G 477
 
2.1%
U 360
 
1.6%
E 354
 
1.5%
F 347
 
1.5%
Other values (12) 783
 
3.4%
Decimal Number
ValueCountFrequency (%)
1 17043
43.8%
2 5632
 
14.5%
7 3254
 
8.4%
3 3181
 
8.2%
8 2209
 
5.7%
5 2177
 
5.6%
4 1946
 
5.0%
9 1472
 
3.8%
6 1104
 
2.8%
0 924
 
2.4%
Other Punctuation
ValueCountFrequency (%)
. 22887
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 7629
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 69458
75.2%
Latin 22887
 
24.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
C 6398
28.0%
N 6301
27.5%
A 5812
25.4%
R 784
 
3.4%
H 780
 
3.4%
B 491
 
2.1%
G 477
 
2.1%
U 360
 
1.6%
E 354
 
1.5%
F 347
 
1.5%
Other values (12) 783
 
3.4%
Common
ValueCountFrequency (%)
. 22887
33.0%
1 17043
24.5%
_ 7629
 
11.0%
2 5632
 
8.1%
7 3254
 
4.7%
3 3181
 
4.6%
8 2209
 
3.2%
5 2177
 
3.1%
4 1946
 
2.8%
9 1472
 
2.1%
Other values (2) 2028
 
2.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 92345
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 22887
24.8%
1 17043
18.5%
_ 7629
 
8.3%
C 6398
 
6.9%
N 6301
 
6.8%
A 5812
 
6.3%
2 5632
 
6.1%
7 3254
 
3.5%
3 3181
 
3.4%
8 2209
 
2.4%
Other values (24) 11999
13.0%

level3Name
Text

Missing 

Distinct: 722
Distinct (%): 9.5%
Missing: 178901
Missing (%): 95.9%
Memory size: 1.4 MiB

Length

Max length: 32
Median length: 29
Mean length: 13.07236497
Min length: 3

Characters and Unicode

Total characters: 99716
Distinct characters: 99
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 255
Unique (%): 3.3%

Sample

1st row: Columbia-Shuswap E
2nd row: Aberdeen
3rd row: Yarmouth Town
4th row: Pont-Rouge
5th row: Grasse

Value               Count    Frequency (%)
subd                1358     8.8%
b                   1147     7.5%
victoria            1108     7.2%
no                  454      3.0%
division            344      2.2%
c                   287      1.9%
h                   282      1.8%
capital             230      1.5%
part                224      1.5%
a                   213      1.4%
Other values (819)  9726     63.3%

Most occurring characters

ValueCountFrequency (%)
a 8445
 
8.5%
(space) 7745
 
7.8%
i 6940
 
7.0%
o 6927
 
6.9%
n 5507
 
5.5%
e 5270
 
5.3%
r 4860
 
4.9%
t 4774
 
4.8%
u 4100
 
4.1%
l 3351
 
3.4%
Other values (89) 41797
41.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 69753
70.0%
Uppercase Letter 15729
 
15.8%
Space Separator 7745
 
7.8%
Other Punctuation 3541
 
3.6%
Dash Punctuation 1261
 
1.3%
Decimal Number 1022
 
1.0%
Open Punctuation 329
 
0.3%
Close Punctuation 326
 
0.3%
Final Punctuation 10
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 8445
12.1%
i 6940
9.9%
o 6927
9.9%
n 5507
 
7.9%
e 5270
 
7.6%
r 4860
 
7.0%
t 4774
 
6.8%
u 4100
 
5.9%
l 3351
 
4.8%
s 2470
 
3.5%
Other values (39) 17109
24.5%
Uppercase Letter
ValueCountFrequency (%)
S 2629
16.7%
B 1955
12.4%
C 1816
11.5%
V 1358
 
8.6%
D 800
 
5.1%
N 747
 
4.7%
L 669
 
4.3%
A 604
 
3.8%
T 572
 
3.6%
H 546
 
3.5%
Other values (20) 4033
25.6%
Decimal Number
ValueCountFrequency (%)
1 330
32.3%
2 245
24.0%
0 154
15.1%
9 76
 
7.4%
7 66
 
6.5%
6 60
 
5.9%
3 31
 
3.0%
4 28
 
2.7%
8 21
 
2.1%
5 11
 
1.1%
Other Punctuation
ValueCountFrequency (%)
. 1855
52.4%
, 1615
45.6%
' 65
 
1.8%
/ 4
 
0.1%
* 2
 
0.1%
Space Separator
ValueCountFrequency (%)
(space) 7745
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1261
100.0%
Open Punctuation
ValueCountFrequency (%)
( 329
100.0%
Close Punctuation
ValueCountFrequency (%)
) 326
100.0%
Final Punctuation
ValueCountFrequency (%)
10
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 85482
85.7%
Common 14234
 
14.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 8445
 
9.9%
i 6940
 
8.1%
o 6927
 
8.1%
n 5507
 
6.4%
e 5270
 
6.2%
r 4860
 
5.7%
t 4774
 
5.6%
u 4100
 
4.8%
l 3351
 
3.9%
S 2629
 
3.1%
Other values (69) 32679
38.2%
Common
ValueCountFrequency (%)
(space) 7745
54.4%
. 1855
 
13.0%
, 1615
 
11.3%
- 1261
 
8.9%
1 330
 
2.3%
( 329
 
2.3%
) 326
 
2.3%
2 245
 
1.7%
0 154
 
1.1%
9 76
 
0.5%
Other values (10) 298
 
2.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 99345
99.6%
None 339
 
0.3%
Latin Ext Additional 22
 
< 0.1%
Punctuation 10
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 8445
 
8.5%
(space) 7745
 
7.8%
i 6940
 
7.0%
o 6927
 
7.0%
n 5507
 
5.5%
e 5270
 
5.3%
r 4860
 
4.9%
t 4774
 
4.8%
u 4100
 
4.1%
l 3351
 
3.4%
Other values (61) 41426
41.7%
None
ValueCountFrequency (%)
é 132
38.9%
è 77
22.7%
Î 28
 
8.3%
ä 21
 
6.2%
É 14
 
4.1%
ł 13
 
3.8%
ñ 9
 
2.7%
ú 7
 
2.1%
ư 6
 
1.8%
ơ 6
 
1.8%
Other values (12) 26
 
7.7%
Latin Ext Additional
ValueCountFrequency (%)
ế 13
59.1%
5
 
22.7%
2
 
9.1%
1
 
4.5%
1
 
4.5%
Punctuation
ValueCountFrequency (%)
10
100.0%

iucnRedListCategory
Text

Missing 

Distinct: 9
Distinct (%): < 0.1%
Missing: 10881
Missing (%): 5.8%
Memory size: 1.4 MiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 351296
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: NE
2nd row: NE
3rd row: LC
4th row: NE
5th row: NE

Value  Count    Frequency (%)
ne     150656   85.8%
lc     24131    13.7%
dd     199      0.1%
nt     177      0.1%
cr     177      0.1%
en     175      0.1%
vu     127      0.1%
ex     5        < 0.1%
ew     1        < 0.1%
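
The two-letter codes are the standard IUCN Red List category abbreviations. The sketch below expands them and totals the counts from the table above. The dictionary and variable names are illustrative, and grouping CR, EN and VU as "threatened" follows the usual IUCN convention rather than anything stated in the report.

```python
IUCN_CATEGORIES = {
    "NE": "Not Evaluated",
    "DD": "Data Deficient",
    "LC": "Least Concern",
    "NT": "Near Threatened",
    "VU": "Vulnerable",
    "EN": "Endangered",
    "CR": "Critically Endangered",
    "EW": "Extinct in the Wild",
    "EX": "Extinct",
}

# Counts copied from the value table for iucnRedListCategory.
counts = {
    "NE": 150656, "LC": 24131, "DD": 199, "NT": 177, "CR": 177,
    "EN": 175, "VU": 127, "EX": 5, "EW": 1,
}

total = sum(counts.values())
threatened = sum(counts[code] for code in ("CR", "EN", "VU"))

for code, n in sorted(counts.items(), key=lambda item: -item[1]):
    print(f"{code}  {IUCN_CATEGORIES[code]:<22} {n:>7}  {100 * n / total:.1f}%")
print("threatened (CR + EN + VU):", threatened)
```

The total of the counts equals the 186529 observations minus the 10881 missing values, and twice that total gives the reported 351296 characters.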

Most occurring characters

ValueCountFrequency (%)
N 151008
43.0%
E 150837
42.9%
C 24308
 
6.9%
L 24131
 
6.9%
D 398
 
0.1%
T 177
 
0.1%
R 177
 
0.1%
V 127
 
< 0.1%
U 127
 
< 0.1%
X 5
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 351296
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 151008
43.0%
E 150837
42.9%
C 24308
 
6.9%
L 24131
 
6.9%
D 398
 
0.1%
T 177
 
0.1%
R 177
 
0.1%
V 127
 
< 0.1%
U 127
 
< 0.1%
X 5
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 351296
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 151008
43.0%
E 150837
42.9%
C 24308
 
6.9%
L 24131
 
6.9%
D 398
 
0.1%
T 177
 
0.1%
R 177
 
0.1%
V 127
 
< 0.1%
U 127
 
< 0.1%
X 5
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 351296
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 151008
43.0%
E 150837
42.9%
C 24308
 
6.9%
L 24131
 
6.9%
D 398
 
0.1%
T 177
 
0.1%
R 177
 
0.1%
V 127
 
< 0.1%
U 127
 
< 0.1%
X 5
 
< 0.1%